(54) Title: SYSTEMS AND METHODS FOR TESTING A BIOLOGICAL SAMPLE

(57) Abstract: Systems and methods for testing samples, particularly
biological samples, are provided. The system includes an instrument for
detecting molecules in samples, and a processor that communicates with the
instrument to provide results-based control of the instrument to effect
assay-based judging. For example, a system, including software, is provided
that directs and performs assays such as diagnostic assays that employ a mass
spectrometer. The output of the system, rather than a mass spectrum or other
raw data form, is the diagnostic outcome, such as a genotype.
SYSTEMS AND METHODS FOR TESTING A BIOLOGICAL SAMPLE

RELATED APPLICATIONS

Benefit of priority is claimed to U.S. application Serial No. 09/839,629,
entitled "SYSTEM AND METHOD FOR TESTING A BIOLOGICAL SAMPLE" to
David Opalsky, Ping Yip and Kishorchandra Bhakta, filed April 20, 2001. Where
permitted, the subject matter of this application is incorporated herein in its
entirety.

TECHNICAL FIELD 

Mass spectrometry-based methods, products and systems for testing
samples, such as biological samples, are provided. In one example, a system
having a processor is used to implement a disclosed testing method.

BACKGROUND

Instruments, such as the mass spectrometer, are now routinely used to
assist in identifying components of a biological sample. In particular, the
MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass
spectrometer has proven useful for making biological determinations, such as
genotyping or identifying single nucleotide polymorphisms.

The MALDI-TOF mass spectrometer generally operates by directing an
energy beam at a target spot on a biological sample. The energy beam
disintegrates the biological material at the target spot, with the disintegrated
component material hurled toward a measurement module. The lighter
component material arrives at the measurement module before the heavier
component material. The measurement module captures the component
material, and generates a data set indicative of the mass of the component
material sensed. Typically, the data set is generated as a two-dimensional
spectrum, with the x-axis representing a mass number, and the y-axis
representing a quantity number.
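
The ordering of arrival times described above can be illustrated with the
textbook idealization of a linear time-of-flight analyzer, in which an ion of
mass m and charge z accelerated through a potential U drifts a length L in
time t = L * sqrt(m / (2 z e U)). The following is a minimal sketch under
that idealization only; the function name, the 20 kV accelerating voltage and
the 1 m drift length are illustrative assumptions and are not taken from this
disclosure.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge in coulombs
DALTON = 1.66053906660e-27    # one dalton in kilograms


def flight_time(mass_da, charge=1, accel_voltage=20_000.0, drift_length=1.0):
    """Idealized drift time (seconds) of an ion in a linear TOF analyzer.

    The voltage and drift length defaults are hypothetical values chosen
    for illustration; real instruments differ.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * E_CHARGE * accel_voltage   # z * e * U
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return drift_length / velocity


# A lighter fragment reaches the detector before a heavier one.
for mass in (5000.0, 6000.0):
    print(f"{mass:.0f} Da: {flight_time(mass) * 1e6:.2f} microseconds")
```

Because the flight time scales with the square root of the mass, a fourfold
increase in mass doubles the arrival time in this idealized model.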

The data, which are often presented as a data spectrum, typically have
peaks positioned on a generally exponentially decaying baseline. Each peak
ideally should represent the presence of a component of the biological sample.
Unfortunately, due to chemical and mechanical limitations, the data spectrum is
replete with noise, so an accurate determination of biological components can be
challenging. Indeed, it takes an experienced operator to accurately read and
interpret a data spectrum. Efforts of even the best trained human operator can
suffer from inaccuracies and errors. Since the results derived from data spectra
often are used in health care decisions, mistakes can be devastating. Therefore,
operators are trained to make a determination only when certain of the result. In
such a manner, a great number of tests result in no-calls, where the operator
cannot clearly identify a data result.

Accordingly, the use of mass spectrometers risks an unacceptably large
number of inaccurate calls, if the operator is applying a rather loose standard to
the data spectrum. Alternatively, the use of mass spectrometers becomes highly
inefficient if the operator discards a large number of tests due to an inability to
confidently make a call.

To assist the operator in making calls, the mass spectrometer can provide
a level of data filtering. Typically, the data filtering attenuates a set magnitude
of noise, thereby more conspicuously exposing valid peaks. Such a filtering
technique, however, can mask important valid peaks, resulting in an incorrect
analysis.
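
The masking problem noted above can be seen in a small numerical sketch. The
code below is not the filtering used by any particular instrument; it simply
subtracts a fixed magnitude from every point, and the function name and the
intensity values are made up for illustration.

```python
def subtract_fixed_noise(intensities, noise_magnitude):
    """Attenuate a set magnitude of noise by subtracting it from every point.

    Anything at or below the chosen magnitude is clipped to zero, whether
    it is noise or a small but valid peak.
    """
    return [max(0.0, y - noise_magnitude) for y in intensities]


# Hypothetical spectrum: a small valid peak at index 2, a large one at index 5.
spectrum = [1.0, 1.2, 4.0, 1.1, 0.9, 12.0, 1.0]
filtered = subtract_fixed_noise(spectrum, 5.0)
print(filtered)
```

With a fixed magnitude of 5.0 the large peak survives, while the smaller
peak at index 2 is removed together with the noise.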

Modern trends in biotechnology are taxing the capabilities of instruments
such as mass spectrometers and their operators. For example, mass
spectrometers are now used to identify single nucleotide polymorphisms (SNPs).
SNPs can produce only slight peaks on the data spectrum, which are easily
missed by an operator or buried in background noise. Further, mass
spectrometers are also used for multiplexing, where multiple gene reactions can
be performed in a single sample. In such a manner, the resulting peaks can be
smaller, more difficult to identify, and there can be more combinations of false
readings. With such complicated data spectra it is becoming more difficult for
an operator to confidently determine if a valid peak exists for a particular genetic
component.

In addition, the mass spectrometer data collection process can be
unnecessarily prolonged for a sample. This can occur, for example, when a
"raster" technique is used to repeatedly acquire spectrum output from a sample
until the output indicates that satisfactory data were received. Inaccurate
analysis of spectrum data can cause satisfactory output to go unrecognized,
resulting in unnecessary rastering to continue collecting additional data.

As tests become more complex and the demand for high throughput
output increases, the mass spectrometer can provide data spectra that are
difficult for an operator to interpret. Even under the best of conditions, the
operator can make identifications where a call should not have been made, or
can discard good acquired data because of perceived ambiguity. Accordingly,
there exists a need for a more efficient and accurate method and system for
identifying samples, including biological samples. Therefore, among
the objects herein, it is an object herein to provide methods, products and
systems to meet such need.

SUMMARY

Testing systems and methods that exhibit efficient and accurate biological
identification of instrument output, such as mass spectrometer output, are
provided. The systems and methods provided herein employ assay-based
judging, in which the ultimate biological significance of the data is fed back
to the data acquisition routines and instrument, to improve the performance of
biological assays, including those involving multiplex assays. In multiplex
formats, each assay is treated as an individual test generating a separate
result. In the systems herein and in accord with the methods provided herein,
biological results are displayed in real time on the user interface of a test
instrument, such as a mass spectrometry control system. The methods and
systems herein are designed to provide high speed and high throughput tests;
only needed data are acquired, thereby eliminating time spent blindly acquiring
unnecessary data.

In particular, systems and methods for implementing the systems for
obtaining and displaying results of assays, generally real-time (RT) results, are
provided. The systems and methods are ideal for high throughput formats, in
which a plurality of samples, typically at least about 96, 384, 1536 and higher
numbers of samples, are tested. The samples can be biological samples and can
include identical samples on which a plurality of tests are performed and samples
from a plurality of different sources on which one or a plurality of tests are
performed or any combination thereof. The systems include an instrument,
such as a mass spectrometer, NMR instrument, spectrometer, gas
chromatograph, high pressure liquid chromatograph (HPLC), or combinations
thereof, for data acquisition, and a processor. The processor directs operation of
the instrument and the assays performed thereby and includes software
(routines) for data collection (data acquisition), and data processing routines to
assess the collected data. Methods employing the systems are provided. In the
systems and methods provided herein, the results of the tests performed by the
assays and the assays, i.e., the data collection routine and data processing
routine, are integrated so that the real-time results are used in directing the data
acquisition.

Mass spectrometry systems are also provided. These systems use
biology-based results to control data acquisition in the mass spectrometer,
thereby significantly improving call efficiency and increasing the instrument
throughput. The exemplified system includes highly optimized versions of calling
algorithms with a streamlined interface to a database to store the results, such
as genotyping results. As part of the optimization, a well-defined programming
interface that controls the dialog between the data acquisition component and
the biological-calling component of spectra analysis was developed. The
interface is flexible and modular to allow modification of the calling algorithms.
The interface that controls the dialog between the data acquisition component
and the biological-calling component is provided.

It has been observed that calling efficiency can be improved by over 50% 
using the techniques provided herein. The improvement has been found to 
depend on the quality of the assay and the level of multiplexing. 

Systems for data acquisition and analysis are provided. The systems
include a computer-directed data collection processing routine and a
computer-directed data processing routine. The data collection and data
processing routines are integrated so that tests are performed on a sample and
the output from an instrument that includes such integrated software is a
diagnosis. Thus, for example, the software, systems and methods provided
herein convert a mass spectrometer into a detector that displays biological
results, particularly real-time biological results, such as a genotype and
allelic frequency.

Methods for testing samples, such as biological samples, are provided.
These methods use a testing system that includes a processor and an instrument
that is configured to acquire data from a sample, such as a biological sample. In
performing the testing method, the instrument acquires data from the sample,
and the processor compares the acquired data to predefined data criteria.
Responsive to comparing the acquired data to the data criteria, the instrument
can be adjusted, and another data set acquired. In one disclosed example of the
testing system, a mass spectrometer acquires data from a biological sample.
The acquired data are compared to predefined spectrum criteria. Responsive to
the comparison, the mass spectrometer can be directed to resample the
biological sample or proceed to another sample.
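
The acquire, compare, adjust and re-acquire sequence described above can be
sketched as a small control loop. This is only an illustration of the flow,
not the disclosed software: the function name run_sample, the three callbacks
and the attempt limit are hypothetical, and a real system would drive an
instrument rather than call plain Python functions.

```python
def run_sample(acquire, meets_criteria, adjust=lambda: None, max_attempts=5):
    """Acquire data until it satisfies the predefined criteria.

    acquire()        -> returns one data set from the instrument (hypothetical)
    meets_criteria() -> compares a data set to the predefined data criteria
    adjust()         -> adjusts the instrument before resampling (hypothetical)

    Returns (data, attempts) on success or (None, attempts) if the limit is
    reached, after which the caller would proceed to another sample.
    """
    for attempt in range(1, max_attempts + 1):
        data = acquire()
        if meets_criteria(data):
            return data, attempt
        if attempt < max_attempts:
            adjust()   # e.g. move to a new spot before acquiring again
    return None, max_attempts


# Simulated instrument whose data quality improves with each acquisition.
qualities = iter([0.2, 0.5, 0.9])
result, attempts = run_sample(lambda: next(qualities), lambda q: q >= 0.8)
print(result, attempts)
```

Keeping the criteria check in a separate callback mirrors the separation
between acquisition and judgment described above, so the criteria can be
changed without touching the loop.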

Advantageously, the disclosed methods and systems for testing samples,
such as a biological sample, provide automated control of an instrument, such
as a mass spectrometer, a gas chromatograph, electrophoresis apparatus, NMR
instruments and other instruments and combinations thereof, and permit direct
readout of results of a test rather than readout of instrument output, such as a
mass spectrum. More particularly, the testing method provides a highly accurate
determination of a sample with minimal manual intervention. Accordingly,
samples can be identified and diagnostic tests performed with a high degree of
precision, speed, and accuracy.

Software and computer-readable media containing such software are
provided. Processors and diagnostic systems that include such software and
instruments that employ the software to direct processing of samples, such as
mass spectrometric analysis of molecules in the samples, are provided. The
software provided herein converts an instrument, such as a mass spectrometer,
into a detector that displays biological results, particularly real-time biological
results, such as a genotype and/or allelic frequency. Instruments, such as a
mass spectrometer, that display biological results are provided.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a testing system provided herein;

FIG. 2 is a flowchart of a testing process provided herein;

FIG. 3 is a flowchart of a testing process provided herein that illustrates
automated control of a testing instrument;

FIG. 4 is a flowchart of a testing process provided herein that illustrates
an assay-based judging process provided herein;

FIG. 5 is a flowchart of a testing process that illustrates acquiring data
from multiple samples to establish the presence of a biological relationship; and

FIG. 6 is an illustration of a computer display showing results from a
testing system.

FIG. 7 is a block diagram of an exemplary testing system provided herein,
in which the instrument is a mass spectrometer (it is understood that the mass
spectrometer is exemplary only and can be replaced by any such data acquisition
instrument).

FIG. 8 is a flowchart that illustrates another embodiment of the
assay-based judging processes provided herein in which performance history is
assessed to increase throughput speed.

FIG. 9 is a diagram that shows an exemplary mass spectrum acquired
and data outcome from data acquisition using prior methods without the
biology-dependent rastering control (assay-based judging).

FIG. 10 is a diagram that shows a mass spectrum acquired and data
outcome from data acquisition using the biology-dependent rastering control
(assay-based judging) as provided herein.

DETAILED DESCRIPTION

A. Definitions

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art to
which the invention(s) belong. All patents, patent applications, published
applications and publications, Genbank sequences, websites and other published
materials referred to throughout the entire disclosure herein, unless noted
otherwise, are incorporated by reference in their entirety. In the event that there
are a plurality of definitions for terms herein, those in this section prevail.
Where reference is made to a URL or other such identifier or address, it is
understood that such identifiers can change and particular information on the
internet can come and go, but equivalent information can be found by searching
the internet. Reference thereto evidences the availability and public
dissemination of such information.

Among the issued patents and published international applications
incorporated by reference and that describe methods that can be adapted for use
with the methods and systems provided herein, are: U.S. Patent Nos.
5,807,522, 6,110,426, 6,024,925, 6,133,436, 5,900,481, 6,043,031,
5,605,798, 5,691,141, 5,547,835, 5,872,003, 5,851,765, 5,622,824,
6,074,823, 6,022,688, 6,111,251, 5,777,324, 5,928,906, 6,225,450,
6,146,854, 6,207,370, U.S. application Serial No. 09/663,968, International
PCT application No. WO 99/12040, WO 97/42348, WO 98/20020, WO
98/20019, WO 99/57318, WO 00/56446, WO 00/60361 and WO 02/25567.
These patents and publications describe a variety of mass spectrometric
analytical methods, substrates and matrices used in mass spectrometric
analyses, and related methods and apparatus, including pin tools and other
dispensing systems. It is intended that the methods, products and systems
provided herein can be adapted for use with the methods and products described
and used in these patents and patent applications as well as other such methods
that employ instruments for detection of molecules and computer-directed
assays, and are particularly suitable for use in high throughput formats. Other
intended uses include any methods and assays that have an instrument for data
acquisition and that employ data-typing analyses.

As used herein, assay-based judging refers to a method in which
decisions regarding further sampling or testing of the assays are based on the
ultimate results, the biological significance (i.e., the biological result such as a
genotype), rather than specific results of data acquisition by an instrument (and
related software), such as a mass spectrum or chromatogram from an
instrument.

As used herein, assay results refer to the output from a particular
protocol, such as, for example, a mass spectrum for molecules in a sample.

As used herein, ultimate results are the actual determination, such as a 
genotype or other diagnosis, achieved by the sampling. 

As used herein, a programming interface refers to specifications for
programming communications, such as application programming interfaces
(APIs) and communication protocols that permit data exchange and transfer
between programs and devices, such as instruments. These include, for
example, APIs for the Microsoft Windows® operating system and for TCP/IP
communications.

As used herein, "good" with reference to data and/or results means that
the skilled artisan would use the data or the results to reach a conclusion or would
not discard such data or results. Whether data or results are good is a function of
the particular data and/or results and will be apparent to the skilled artisan
familiar with such data and/or results and the related technologies.

As used herein, a call refers to identification of a data result, such as a
genotype or a diagnosis or allotype.

As used herein, an assay design refers to the instructions for effecting a 
protocol to perform an assay, such as a diagnostic test, including those involving 
genotyping. 

As used herein, real-time (RT) control refers to the ability of an RT
workstation to receive data from the data acquisition instrument, such as a mass
spectrometer, to process the data and provide command direction to the
instrument, such as a mass spectrometer, in an automated manner.

As used herein, a data collection routine refers to a process, typically an
automated computer-controlled process that can be embodied in software, that
controls data acquisition by an instrument, such as a mass spectrometer. The
routine directs the instrument in collection of data and determines if output
data, such as mass data from a mass spectrometer, are of suitable quality for
analysis. For example, the data collection routine can assess
signal-to-noise ratios.
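
One simple way such a quality check could be expressed is sketched below.
The definition used here (peak height above the mean of a peak-free region,
divided by the standard deviation of that region) is a common convention and
an assumption of this sketch, not a definition taken from this disclosure;
the function name and the sample values are likewise illustrative.

```python
import math
import statistics


def signal_to_noise(intensities, peak_index, noise_indices):
    """Estimate the signal-to-noise ratio of one peak.

    The baseline and noise level are estimated from the points listed in
    noise_indices, which are assumed to contain no peaks.
    """
    noise = [intensities[i] for i in noise_indices]
    baseline = statistics.mean(noise)
    sigma = statistics.pstdev(noise)
    if sigma == 0:
        return math.inf          # perfectly flat noise region
    return (intensities[peak_index] - baseline) / sigma


# Hypothetical data: four noise points followed by one peak.
spectrum = [1.0, 3.0, 1.0, 3.0, 22.0]
snr = signal_to_noise(spectrum, peak_index=4, noise_indices=range(4))
print(snr)
```

A data collection routine could compare a value of this kind against a
threshold to decide whether the acquired data are suitable for analysis.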

As used herein, a data processing routine refers to a process, that can be
embodied in software, that determines the biological significance of acquired
data (i.e., the ultimate results of the assay). For example, the data processing
routine can make a genotype determination based upon the data collected. In
the systems and methods herein, the data processing routine also controls the
instrument and/or the data collection routine based upon the results determined.
The data processing routine and the data collection routines are integrated and
provide feedback to operate the data acquisition by the instrument, and hence
provide the assay-based judging methods provided herein.
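
The feedback between the two routines can be sketched for a multiplexed
sample as follows. This is a simplified illustration only: the two-allele
calling rule, its thresholds, the summing of peak heights across shots and
all names (call_genotype, judge_sample, acquire_shot) are assumptions made
for the example and are not the calling algorithms of this disclosure.

```python
def call_genotype(height_a, height_b, min_height=5.0, het_fraction=0.3):
    """Return 'AA', 'AB', 'BB', or None (no-call) from two allele peak heights.

    min_height and het_fraction are hypothetical thresholds.
    """
    if max(height_a, height_b) < min_height:
        return None                      # not enough signal to judge yet
    fraction_a = height_a / (height_a + height_b)
    if fraction_a >= 1.0 - het_fraction:
        return "AA"
    if fraction_a <= het_fraction:
        return "BB"
    return "AB"


def judge_sample(acquire_shot, assays, max_shots=10):
    """Keep acquiring until every assay in the multiplex has a call.

    acquire_shot(shot) is a hypothetical callback returning
    {assay_name: (height_a, height_b)} for one acquisition.
    """
    calls = {name: None for name in assays}
    summed = {name: [0.0, 0.0] for name in assays}
    for shot in range(1, max_shots + 1):
        peaks = acquire_shot(shot)
        for name in assays:
            if calls[name] is None:      # each assay is judged separately
                summed[name][0] += peaks[name][0]
                summed[name][1] += peaks[name][1]
                calls[name] = call_genotype(*summed[name])
        if all(call is not None for call in calls.values()):
            return calls, shot           # biological result reached: stop
    return calls, max_shots


def fake_shot(shot):
    # Simulated acquisition: the same weak peaks on every shot.
    return {"snp1": (3.0, 0.1), "snp2": (1.5, 1.5)}


print(judge_sample(fake_shot, ["snp1", "snp2"]))
```

The stopping decision here depends on whether a genotype could be called for
each assay, not on a property of the spectrum alone, which is the distinction
the definition of assay-based judging draws.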

As used herein, "sample" refers to a composition containing a material,
such as a molecule, to be detected. In an exemplary embodiment, the sample is
a "biological sample" (i.e., any material obtained from a living source (e.g.,
human, animal, plant, bacteria, fungi, protist, virus)). The biological sample can
be in any form, including solid materials (e.g., tissue, cell pellets and biopsies)
and biological fluids (e.g., urine, blood, saliva, amniotic fluid and mouth wash
(containing buccal cells)). Solid materials typically are mixed with a fluid. In
particular, herein, the sample refers to a mixture of matrix used for mass
spectrometric analyses and biological material such as nucleic acids.

As used herein, a "biological sample" refers to material that can be
derived from a living source. Such samples include biomolecules and
biopolymers. The molecules can be treated, such as by amplification, cloning
and subcloning, and isolation processes prior to assessment.

As used herein, a molecule refers to any molecule or compound that is
linked to or contained on or in a well or other indentation on or in a solid
support, such as a chip. Typically such molecules are macromolecules or
components or precursors thereof, such as peptides, proteins, small organics,
oligonucleotides or monomeric units of the peptides, organics, nucleic acids and
other macromolecules. A monomeric unit refers to one of the constituents from
which the resulting compound is built. Thus, monomeric units include
nucleotides, amino acids, and pharmacophores from which small organic
molecules are synthesized.

As used herein, macromolecule refers to any molecule having a molecular
weight from the hundreds up to the millions. Macromolecules include peptides,
proteins, nucleotides, nucleic acids, and other such molecules that are generally
synthesized by biological organisms, but can be prepared synthetically or using
recombinant molecular biology methods.

As used herein, a biopolymer includes, but is not limited to, nucleic acids,
proteins, polysaccharides, lipids and other macromolecules. Nucleic acids
include DNA, RNA, and fragments thereof. Nucleic acids can be isolated or
derived from genomic DNA, RNA, mitochondrial nucleic acid, chloroplast nucleic
acid and other organelles with separate genetic material or can be prepared
synthetically. Thus, the term "biopolymer" is used to mean a biological
molecule, including macromolecules, composed of two or more monomeric
subunits, or derivatives thereof, which are linked by a bond or a macromolecule.
A biopolymer can be, for example, a polynucleotide, a polypeptide, a
carbohydrate, or a lipid, or derivatives or combinations thereof, for example, a
nucleic acid molecule containing a peptide nucleic acid portion or a glycoprotein,
respectively. The methods and systems herein, though described with reference
to biopolymers, can be adapted for use with other synthetic schemes and
assays, such as organic syntheses of pharmaceuticals, or inorganics and any
other reaction or assay performed on a solid support or in a well in nanoliter or
smaller volumes.

As used herein, labels include any composition or moiety that can be
attached to or incorporated into nucleic acid that is detectable by spectroscopic,
photochemical, biochemical, immunochemical, electrical, optical or chemical
means. Exemplary labels include, but are not limited to, biotin for staining with
labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent
dyes (e.g., 6-FAM, HEX, TET, TAMRA, ROX, JOE, 5-FAM, R110, fluorescein,
Texas Red, rhodamine, phycoerythrin, lissamine, phycoerythrin (Perkin Elmer
Cetus), Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, FluorX (Amersham)), radiolabels,
enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others used in
ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic
(e.g., polystyrene, polypropylene, latex and other supports) beads, a fluorophore,
a radioisotope or a chemiluminescent moiety.

As used herein, a biological particle refers to a virus, such as a viral
vector or viral capsid with or without packaged nucleic acid, phage, including a
phage vector or phage capsid, with or without encapsulated nucleic acid, a
single cell, including eukaryotic and prokaryotic cells or fragments thereof, a
liposome or micellar agent or other packaging particle, and other such biological
materials. For purposes herein, biological particles include molecules that are not
typically considered macromolecules because they are not generally synthesized,
but are derived from cells and viruses.

As used herein, the term "nucleic acid" refers to single-stranded and/or
double-stranded polynucleotides such as deoxyribonucleic acid (DNA), and
ribonucleic acid (RNA) as well as analogs or derivatives of either RNA or DNA,
such as peptide nucleic acid (PNA), phosphorothioate DNA, and other such
analogs and derivatives or combinations thereof. Thus, as used herein, nucleic
acids include DNA, RNA and analogs thereof, including protein nucleic acids
(PNA) and mixtures thereof. When referring to probes or primers, optionally
labeled with a detectable label, such as a fluorescent or radiolabel,
single-stranded molecules are contemplated. Such molecules are typically of a
length such that they are statistically unique or low copy number (typically less
than 5 or 6, generally less than 3 copies in a library) for probing or priming a
library. Generally a probe or primer contains at least 14, 16 or 30 contiguous
nucleotides from a selected sequence thereof complementary to or identical to a
polynucleotide of interest. Probes and primers can be 10, 14, 16, 20, 30, 50,
100 or more nucleic acid bases long.

As used herein, the term "polynucleotide" refers to an oligomer or
polymer containing at least two linked nucleotides or nucleotide derivatives,
including a deoxyribonucleic acid (DNA), a ribonucleic acid (RNA), and a DNA or
RNA derivative containing, for example, a nucleotide analog or a "backbone"
bond other than a phosphodiester bond, for example, a phosphotriester bond, a
phosphoramidate bond, a phosphorothioate bond, a thioester bond, or a peptide
bond (peptide nucleic acid). The term "oligonucleotide" also is used herein
essentially synonymously with "polynucleotide," although those in the art
recognize that oligonucleotides, for example, PCR primers, generally are less
than about fifty to one hundred nucleotides in length.

Nucleotide analogs contained in a polynucleotide can be, for example,
mass modified nucleotides, which allow for mass differentiation of
polynucleotides; nucleotides containing a detectable label such as a fluorescent,
radioactive, luminescent or chemiluminescent label, which allow for detection of
a polynucleotide; or nucleotides containing a reactive group such as biotin or a
thiol group, which facilitates immobilization of a polynucleotide to a solid
support. A polynucleotide also can contain one or more backbone bonds that
are selectively cleavable, for example, chemically, enzymatically or
photolytically. For example, a polynucleotide can include one or more
deoxyribonucleotides, followed by one or more ribonucleotides, which can be
followed by one or more deoxyribonucleotides, which is cleavable at the
ribonucleotide sequence by base hydrolysis. A polynucleotide also can contain
one or more bonds that are relatively resistant to cleavage, for example, a
chimeric oligonucleotide primer, which can include nucleotides linked by peptide
nucleic acid bonds and at least one nucleotide at the 3' end, which is linked by a
phosphodiester bond, or other such bond or linkage, and can be extended by a
polymerase. Peptide nucleic acid sequences can be prepared using well known
methods (see, for example, Weiler et al., Nucleic Acids Res. 25:2792-2799
(1997)).

A polynucleotide can be a portion of a larger nucleic acid molecule, for
example, a portion of a gene, which can contain a polymorphic region, or a
portion of an extragenic region of a chromosome, for example, a portion of a
region of nucleotide repeats such as a short tandem repeat (STR) locus, a
variable number of tandem repeats (VNTR) locus, a microsatellite locus or a
minisatellite locus. A polynucleotide also can be single stranded or double
stranded, including, for example, a DNA-RNA hybrid, or can be triple stranded or
four stranded. Where the polynucleotide is double stranded DNA, it can be in an
A, B, L or Z configuration, and a single polynucleotide can contain combinations
of such configurations.

As used herein, "oligonucleotide," "polynucleotide" and "nucleic acid"
include linear oligomers of natural or modified monomers or linkages, including
deoxyribonucleosides, ribonucleotides, α-anomeric forms thereof capable of
specifically binding to a target gene by way of a regular pattern of
monomer-to-monomer interactions, such as Watson-Crick type of base pairing,
base stacking, Hoogsteen or reverse Hoogsteen types of base pairing. Monomers
are typically linked by phosphodiester bonds or analogs thereof to form the
oligonucleotides. Whenever an oligonucleotide is represented by a sequence of
letters, such as "ATGCCTG," it is understood that the nucleotides are in a
5'->3' order from left to right.

Typically oligonucleotides for hybridization include the four natural
nucleotides; however, they also can include non-natural nucleotide analogs,
derivatized forms or mimetics. Analogs of phosphodiester linkages include,
for example, phosphorothioate, phosphorodithioate, phosphoranilidate and
phosphoramidate. A particular example of a mimetic is protein nucleic acid
(see, e.g., Egholm et al. (1993) Nature 355:566; see also U.S. Patent
No. 5,539,083).

As used herein, the term "polypeptide" means at least two amino acids,
or amino acid derivatives, including mass modified amino acids and amino acid
analogs, that are linked by a peptide bond, which can be a modified peptide
bond. A polypeptide can be translated from a polynucleotide, which can include
at least a portion of a coding sequence, or a portion of a nucleotide sequence
that is not naturally translated due, for example, to its location in a reading frame
other than a coding frame, or its location in an intron sequence, a 3' or 5'
untranslated sequence, or a regulatory sequence such as a promoter. A
polypeptide also can be chemically synthesized and can be modified by chemical
or enzymatic methods following translation or chemical synthesis. The terms
"polypeptide," "peptide" and "protein" are used essentially synonymously
herein, although the skilled artisan recognizes that peptides generally contain
fewer than about fifty to one hundred amino acid residues, and that proteins
often are obtained from a natural source and can contain, for example,
post-translational modifications. A polypeptide can be post-translationally
modified by, for example, phosphorylation (phosphoproteins) or glycosylation
(glycoproteins, proteoglycans), which can be performed in a cell or in a reaction
in vitro.

As used herein, the term "conjugated" refers to stable attachment, typically
by virtue of a chemical interaction, including ionic and/or covalent attachment.
Among conjugation means are: streptavidin- or avidin-to-biotin interaction;
hydrophobic interaction; magnetic interaction (e.g., using functionalized
magnetic beads, such as DYNABEADS, which are streptavidin-coated magnetic
beads sold by Dynal, Inc., Great Neck, NY and Oslo, Norway); polar interactions,
such as "wetting" associations between two polar surfaces or between
oligo/polyethylene glycol; formation of a covalent bond, such as an amide bond,
disulfide bond, thioether bond, or via crosslinking agents; and via an acid-labile
or photocleavable linker.

As used herein, a composition refers to any mixture. It can be a
solution, a suspension, liquid, powder, a paste, aqueous, non-aqueous or any
combination thereof.

As used herein, a combination refers to any association between or among
two or more items. The combination can be two or more separate items, such as
two compositions or two collections, can be a mixture thereof, such as a single
mixture of the two or more items, or any variation thereof.

As used herein, fluid refers to any composition that can flow. Fluids thus 
encompass compositions that are in the form of semi-solids, pastes, solutions, 
aqueous mixtures, gels, lotions, creams and other such compositions. 

As used herein, the term "solid support" means a non-gaseous, non-liquid
material having a surface. Thus, a solid support can be a flat surface
constructed, for example, of glass, silicon, metal, plastic or a composite; or can
be in the form of a bead such as a silica gel, a controlled pore glass, a magnetic
or cellulose bead; or can be a pin, including an array of pins suitable for
combinatorial synthesis or analysis.

As used herein, a collection contains two, generally three, or more
elements.

As used herein, an array refers to a collection of elements, such as cells
and nucleic acid molecules, containing three or more members; arrays can be in
solid phase or liquid phase. An addressable array or collection is one in which
each member of the collection is identifiable, typically by position on a solid
phase support or by virtue of an identifiable or detectable label, such as by color,
fluorescence, electronic signal (i.e., RF, microwave or other frequency that does
not substantially alter the interaction of the molecules of interest), bar code or
other symbology, chemical or other such label. Hence, in general the members
of the array are immobilized to discrete identifiable loci on the surface of a solid
phase or directly or indirectly linked to or otherwise associated with the
identifiable label, such as affixed to a microsphere or other particulate support
(herein referred to as beads) and suspended in solution or spread out on a
surface. The collection can be in the liquid phase if other discrete identifiers,
such as chemical, electronic, colored, fluorescent or other tags are included.

As used herein, a substrate (also referred to as a matrix support, a matrix,
an insoluble support, a support or a solid support) refers to any solid or semisolid
or insoluble support to which a molecule of interest, typically a biological
molecule, organic molecule or biospecific ligand, is linked or contacted. A
substrate or support refers to any insoluble material or matrix that is used, either
directly or following suitable derivatization, as a solid support for chemical
synthesis, assays and other such processes. Substrates contemplated herein
include, for example, silicon substrates or siliconized substrates that are
optionally derivatized on the surface intended for linkage of anti-ligands and
ligands and other macromolecules. Other substrates are those on which cells
adhere.

Such materials include any materials that are used as affinity matrices or
supports for chemical and biological molecule syntheses and analyses, such as,
but not limited to: polystyrene, polycarbonate, polypropylene, nylon, glass,
dextran, chitin, sand, pumice, agarose, polysaccharides, dendrimers, buckyballs,
polyacrylamide, silicon, rubber, and other materials used as supports for solid
phase syntheses, affinity separations and purifications, hybridization reactions,
immunoassays and other such applications.

Thus, a substrate, support or matrix refers to any solid or semisolid or
insoluble support on which the molecule of interest, typically a biological
molecule, macromolecule, organic molecule or biospecific ligand or cell is linked
or contacted. Typically a matrix is a substrate material having a rigid or
semi-rigid surface. In many embodiments, at least one surface of the substrate is
substantially flat or is a well, although in some embodiments it can be desirable
to physically separate synthesis regions for different polymers with, for example,
wells, raised regions, etched trenches, or other such topology. Matrix materials
include any materials that are used as affinity matrices or supports for chemical
and biological molecule syntheses and analyses, such as, but not limited to:
polystyrene, polycarbonate, polypropylene, nylon, glass, dextran, chitin, sand,
pumice, polytetrafluoroethylene, agarose, polysaccharides, dendrimers,
buckyballs, polyacrylamide, Kieselguhr-polyacrylamide non-covalent composite,
polystyrene-polyacrylamide covalent composite, polystyrene-PEG
(polyethyleneglycol) composite, silicon, rubber, and other materials used as
supports for solid phase syntheses, affinity separations and purifications,
hybridization reactions, immunoassays and other such applications.

The substrate, support or matrix herein can be particulate or can be
in the form of a continuous surface, such as a microtiter dish or well, a glass
slide, a silicon chip, a nitrocellulose sheet, nylon mesh, or other such materials.
When particulate, typically the particles have at least one dimension in the
5-10 mm range or smaller. Such particles, referred to collectively herein as
"beads", are often, but not necessarily, spherical. Such reference, however,
does not constrain the geometry of the matrix, which can be any shape,
including random shapes, needles, fibers, and elongated shapes. Roughly spherical
"beads", particularly microspheres that can be used in the liquid phase, are also
contemplated. The "beads" can include additional components, such as
magnetic or paramagnetic particles (see, e.g., Dynabeads (Dynal, Oslo,
Norway)) for separation using magnets, as long as the additional components do
not interfere with the methods and analyses herein. For the collections of cells,
the substrate should be selected so that it is addressable (i.e., identifiable) and
such that the cells are linked, absorbed, adsorbed or otherwise retained
thereon.

As used herein, matrix or support particles refers to matrix materials that
are in the form of discrete particles. The particles have any shape and
dimensions, but typically have at least one dimension that is 100 mm or less, 50
mm or less, 10 mm or less, 1 mm or less, 100 µm or less, 50 µm or less and
typically have a size that is 100 mm³ or less, 50 mm³ or less, 10 mm³ or less,
and 1 mm³ or less, 100 µm³ or less and can be on the order of cubic microns. Such
particles are collectively called "beads."

As used herein, high density arrays refer to arrays that contain 384 or
more, including 1536 or more or any multiple of 96 or other selected base, loci
per support, which is typically about the size of a standard 96 well microtiter
plate. Each such array is typically, although not necessarily, standardized to be
the size of a 96 well microtiter plate. It is understood that other numbers of
loci, such as 10, 100, 200, 300, 400, 500, 10^n, wherein n is any number from
0 and up to 10 or more, can be used. Ninety-six is merely an exemplary number. For
addressable collections that are homogeneous (i.e., not affixed to a solid
support), the numbers of members are generally greater. Such collections can
be labeled chemically or electronically (such as with radio-frequency, microwave or
other detectable electromagnetic frequency that does not substantially interfere
with a selected assay or biological interaction).

As used herein, the attachment layer refers to the surface of the chip device
to which molecules are linked. A chip can be a silicon semiconductor device,
which is coated on at least a portion of the surface to render it suitable for linking
molecules and inert to any reactions to which the device is exposed. Molecules
are linked either directly or indirectly to the surface; linkage can be effected by
absorption or adsorption, through covalent bonds, ionic interactions or any other
interaction. Where necessary the attachment layer is adapted, such as by
derivatization, for linking the molecules.

As used herein, a gene chip, also called a genome chip and a microarray,
refers to high density oligonucleotide-based arrays. Such chips typically refer to
arrays of oligonucleotides designed for monitoring an entire genome, but can be
designed to monitor a subset thereof. Gene chips contain arrayed polynucleotide
chains (oligonucleotides of DNA or RNA or nucleic acid analogs or combinations
thereof) that are single-stranded, or at least partially or completely
single-stranded prior to hybridization. The oligonucleotides are designed to specifically
and generally uniquely hybridize to particular polynucleotides in a population,
whereby by virtue of formation of a hybrid the presence of a polynucleotide in a
population can be identified. Gene chips are commercially available or can be
prepared. Exemplary microarrays include the Affymetrix GeneChip® arrays. Such
arrays are typically fabricated by high speed robotics on glass, nylon or other
suitable substrate, and include a plurality of probes (oligonucleotides) of known
identity defined by their address in (or on) the array (an addressable locus). The
oligonucleotides are used to determine complementary binding and to thereby
provide parallel gene expression and gene discovery in a sample containing
target nucleic acid molecules. Thus, as used herein, a gene chip refers to an
addressable array, typically a two-dimensional array, that includes a plurality of
oligonucleotides associated with addressable loci ("addresses"), such as on a
surface of a microtiter plate or other solid support.

As used herein, a plurality of genes includes at least two, five, 10, 25,
50, 100, 250, 500, 1000, 2,500, 5,000, 10,000, 100,000, 1,000,000 or more
genes. A plurality of genes can include complete or partial genomes of an
organism or even a plurality thereof. Selecting the organism type determines the
genome from among which the gene regulatory regions are selected. Exemplary
organisms for gene screening include animals, such as mammals, including
human and rodent, such as mouse, insects, yeast, bacteria, parasites, and
plants.

As used herein, the term "target site" refers to a specific locus on a solid
support upon which material, such as matrix material, matrix material with
sample, and sample, can be deposited and retained. A solid support contains
one or more target sites, which can be arranged randomly or in ordered array or
other pattern. When used for mass spectrometric analyses, such as MALDI
analyses, a target site or the resulting site with deposited material can be equal
to or less than the size of the laser spot that is focussed on the substrate to
effect desorption. Thus, a target site can be, for example, a well or pit, a pin or
bead, or a physical barrier that is positioned on a surface of the solid support, or
combinations thereof such as, but not limited to, beads on a chip and chips
in wells. A target site can be physically placed onto the support, can be etched
on a surface of the support, can be a "tower" that remains following etching
around a locus, or can be defined by physico-chemical parameters such as
relative hydrophilicity, hydrophobicity, or any other surface chemistry that
retains a liquid therein or thereon. A solid support can have a single target site,
or can contain a number of target sites, which can be the same or different, and
where the solid support contains more than one target site, the target sites can
be arranged in any pattern, including, for example, an array, in which the
location of each target site is defined.

As used herein, the term "liquid dispensing system" means a device that
can transfer a predetermined amount of liquid to a target site. The amount of
liquid dispensed and the rate at which the liquid dispensing system dispenses the
liquid to a target site.

As used herein, the term "liquid" is used broadly to mean a non-solid, 
non-gaseous material, which can be homogeneous or heterogeneous, and can 
contain one or more solid or gaseous materials dissolved or suspended therein. 

As used herein, the term "reaction mixture" refers to any solution in
which a chemical, physical or biological change is effected. In general, a change
to a molecule is effected, although changes to cells also are contemplated. A



WO 02/086794 



PCT/US02/12903 




reaction mixture can contain a solvent, which provides, in part, appropriate
conditions for the change to be effected, and a substrate, upon which the
change is effected. A reaction mixture also can contain various reagents,
including buffers, salts, and metal cofactors, and can contain reagents specific
to a reaction, for example, enzymes, nucleoside triphosphates and amino acids.
For convenience, reference is made herein generally to a "component" of a
reaction, wherein the component can be a cell or molecule present in a reaction
mixture, including, for example, a biopolymer or a product thereof.

As used herein, submicroliter volume refers to a volume conveniently
measured in nanoliters or smaller and encompasses, for example, about
500 nanoliters or less, or 50 nanoliters or less or 10 nanoliters or less, or can be
measured in picoliters, for example, about 500 picoliters or less or about 50
picoliters or less. For convenience of discussion, the term "submicroliter" is
used herein to refer to a reaction volume less than about one microliter, although
it is apparent to those in the art that the systems and methods disclosed herein
are applicable to subnanoliter reaction volumes, such as picovolumes, as well.

As used herein, nanoliter volumes generally refer to volumes between
about 1 nanoliter up to less than about 100, generally about 50 or 10 nanoliters.

As used herein, with respect to the supports, an element is defined as
less hydrophobic than another by the relative "wettability" of the element or
contact angles, where the contact angle of an element is less than that of the
surrounding surface. The contact angle is the angle that breaks the surface
tension when a liquid is delivered. A hydrophilic substrate requires a relatively
lower contact angle than a more hydrophobic material. Hence contact angle
refers to relative hydrophobicity between or among surfaces. Hence loci on
supports can be defined by their relative hydrophobicity/hydrophilicity to
surrounding areas.

As used herein, high-throughput screening (HTS) refers to processes that
test a large number of samples, such as samples of test proteins or cells
containing nucleic acids encoding the proteins of interest, to identify structures of
interest or to identify test compounds that interact with the variant proteins or
cells containing them. HTS operations are amenable to automation and are
typically computerized to handle sample preparation, assay procedures and the
subsequent processing of large volumes of data.

As used herein, symbology refers to a code, such as a bar code or other
symbol, that is engraved, stamped or imprinted on a substrate. The symbology
is any code known or designed by the user. In general, the symbols are
identifiable to the user or are associated with information stored in a computer or
memory and associated with identifying information.

As used herein, phenotype refers to the physical or other manifestation of
a genotype (a sequence of a gene).

As used herein, the abbreviations for amino acids and protective groups 
and other abbreviations are in accord with their common usage and, if 
appropriate, the IUPAC-IUB Commission on Biochemical Nomenclature (see, 
(1972) Biochem. 11: 942-944). 

As used herein, the amino acids, which occur in the various amino acid 
sequences appearing herein, are identified according to their known, three-letter 
or one-letter abbreviations. The nucleotides, which occur in the various nucleic 
acid fragments, are designated with the standard single-letter designations used 
routinely in the art. 

It should be noted that any amino acid residue sequences represented
herein by formulae have a left to right orientation in the conventional direction of
amino-terminus to carboxyl-terminus. In addition, the phrase "amino acid
residue" includes the amino acids listed in the Table of Correspondence and
modified and unusual amino acids, such as those referred to in 37 C.F.R. §§
1.821-1.822, and incorporated herein by reference. Furthermore, it should be
noted that a dash at the beginning or end of an amino acid residue sequence
indicates a peptide bond to a further sequence of one or more amino acid
residues or to an amino-terminal group such as NH2 or to a carboxyl-terminal
group such as COOH.

As used herein, amplifying refers to means for increasing the amount of a
biopolymer, especially nucleic acids. Based on the 5' and 3' primers that are
chosen, amplification also serves to restrict and define the region of the genome
that is subject to analysis. Amplification can be by any means known to those
skilled in the art, including use of the polymerase chain reaction (PCR) and other
amplification protocols, such as ligase chain reaction, RNA replication, such as
the autocatalytic replication catalyzed by, for example, Qβ replicase.
Amplification is done quantitatively when the frequency of a polymorphism is
determined.

B. Systems and methods 

Systems that contain a data acquisition instrument, such as a mass
spectrometer, an NMR instrument, a gas chromatograph, combinations thereof
and other such instruments, that acquires data from a sample, such as a
biological sample, and processors for assay-based judging decisions are
provided. The processor contains programs (routines) for data collection and
data analysis that are integrated, such as by a computer-based calling
component, so that the instrument can provide real time (RT) diagnostic or other
data typing results from samples tested. The data collection routines control the
instrument for data collection; and the data analysis process controls the data
collection process, for example, to determine a need for further sampling. The
results from the data analysis are fed back (integrated) to the instrument and
control thereof to assess the need for more sampling or testing.

Commercial MALDI mass spectrometers typically perform automated
measurements on a series of samples. Software packages for automation
include integrated algorithms that are used to judge the quality of the spectra.
Such algorithms assess parameters such as the signal-to-noise ratio, peak
resolution, and/or signal intensity within a specified mass range. If an acquired
spectrum is determined to be of low quality, the instrument parameters can be
adjusted and/or the stage can be moved ("rastered") to another section of the
sample for re-acquisition of the spectrum. The cycle of evaluation and
re-acquisition is repeated until either a spectrum of sufficient quality is acquired or
a pre-specified number of acquisition attempts have been made. The spectrum is
then saved and the system moves on to the next sample.
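
The conventional acquisition cycle described above can be sketched as follows. This is a minimal illustration only, not the control software of any commercial instrument: the `Spectrum` fields, the `acquire` callback, the threshold parameters and the function name are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Spectrum:
    # Hypothetical summary of one acquired spectrum (assumed fields).
    signal_to_noise: float  # signal-to-noise ratio in the judged mass range
    resolution: float       # peak resolution in the judged mass range


def acquire_with_spectrum_judging(
    acquire: Callable[[int], Spectrum],
    raster_positions: Sequence[int],
    min_signal_to_noise: float,
    min_resolution: float,
    max_attempts: int,
) -> Tuple[Optional[Spectrum], int]:
    """Re-acquire at successive raster positions until a spectrum passes the
    quality thresholds or the attempt limit is reached.

    Returns the last spectrum acquired (the one that would be saved) and
    the number of acquisition attempts made.
    """
    spectrum: Optional[Spectrum] = None
    attempts = 0
    for position in raster_positions[:max_attempts]:
        spectrum = acquire(position)
        attempts += 1
        # Judging uses spectral quality only; no assay information is used.
        good = (spectrum.signal_to_noise >= min_signal_to_noise
                and spectrum.resolution >= min_resolution)
        if good:
            break
    return spectrum, attempts
```

Note that the stopping rule in this sketch looks only at spectral quality, which is the property of conventional judging that the following paragraphs contrast with assay-based judging.
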
In these systems, the integrated judging algorithms make determinations
based on qualities of the spectra that are independent of the underlying assay or
biological information contained in the spectra. During an automated run, a
spectrum for each sample is stored. Then special purpose algorithms are
employed that automatically determine the sample genotype. For example, the
SPECTROTYPER mass spectrometry system (Sequenom, Inc., San Diego, CA;
see also U.S. application Serial No. 09/285,481 filed 4/5/99, published as U.S.
application Publication No. US-2002-0009394-A1; International PCT application
No. WO 02/25567) is an automated data processing system that, among other
determinations, determines one or more genotypes in each sample depending on
the assay definition for that sample and assigns each a quality, generally from
best to worst and/or conservative, moderate, aggressive, low probability or bad
spectrum.

In the above-described systems, which combine automated data
collection routines and automated data processing routines, two different sets of
criteria are used to judge the spectra: one set of criteria is used to control the
data acquisition process and a separate set is used to determine the biological
significance of the acquired spectrum. Using such two-step acquisition and
analysis routines, however, can result in missed calls and unnecessarily long
acquisition times. This is because the spectral features that define a "clean"
acquisition are not necessarily the same features required for, for example,
accurate genotyping. For example, the presence of large primer peaks due to
incomplete extension can render a spectrum acceptable in terms of signal to
noise criteria in a predefined mass window, but the resulting spectrum might not
be of sufficient quality to allow determination of an unambiguous genotype. It is
also possible that a spectrum that is of high quality for genotyping has a
signal-to-noise ratio that causes repeated sampling by the data collection algorithm. In
this case, unneeded data are collected with a corresponding decrease in
throughput. When different criteria are used for data collection and for data
analysis it is always possible that either the data collected do not give a suitable
biological result or that extra data are collected resulting in lower throughput.
Furthermore, the mismatches between the two judging methods become more
common as the spectra from a sample become more complex, as with highly
multiplexed samples.

The methods and systems herein provide integration of the algorithms for
data analysis and data collection and result in faster, more accurate MALDI
analyses, such as genotyping. An exemplary system provided herein, which is a
modification of systems, such as the SPECTROTYPER system, includes highly
optimized versions of the calling algorithms with a streamlined interface to a
database to store the results of analyses, such as genotyping results. As part of
the optimization, a well defined programming interface that controls the dialogue
between the data acquisition component and the biological-calling component of
spectra analysis is provided. The interface is flexible and modular to allow
modification of the calling algorithms.

With the systems and methods provided herein, the biological results, such
as a genotype, guide data acquisition decisions, not instrument output such as a
mass spectrum. The systems herein consider the ultimate biological results,
not the output of the instrument, and determine whether the results are good
enough. If they are not, the system directs the instrument to obtain further data. In some
embodiments, the system can eliminate further processing steps in a particular
assay if it repeatedly fails. The results of the assays, not necessarily the
instrument's results, are displayed, and can be displayed in real-time.
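
By way of illustration, the assay-based acquisition decision described above can be sketched as a loop in which the stopping condition is the biological call rather than a spectral quality metric. This is a hedged sketch under stated assumptions, not the calling algorithms of the disclosed system: spectra are modeled as lists of intensities on a shared mass axis, accumulated data are combined by simple averaging, and `acquire`, `call_assays` and `acquire_with_assay_judging` are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Assumption: a spectrum is a list of intensities on a shared mass axis.
Spectrum = List[float]
# Assumption: a caller maps a spectrum to one call per assay, None if no call.
Calls = Dict[str, Optional[str]]


def average(spectra: Sequence[Spectrum]) -> Spectrum:
    """Point-wise average of spectra acquired so far."""
    count = len(spectra)
    return [sum(values) / count for values in zip(*spectra)]


def acquire_with_assay_judging(
    acquire: Callable[[int], Spectrum],
    call_assays: Callable[[Spectrum], Calls],
    raster_positions: Sequence[int],
    max_acquisitions: int,
) -> Tuple[Calls, int]:
    """Add data from further raster positions until every assay in the
    sample yields an acceptable call, or the acquisition limit is reached.
    """
    collected: List[Spectrum] = []
    calls: Calls = {}
    for position in raster_positions[:max_acquisitions]:
        collected.append(acquire(position))
        # Judge the biological result, not the raw spectrum quality.
        calls = call_assays(average(collected))
        if all(call is not None for call in calls.values()):
            break
    return calls, len(collected)
```

In this sketch a multiplexed sample keeps acquiring while any one assay lacks a call, even if the spectrum as a whole would pass a signal-to-noise test.
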

The instruments as provided herein display not (or not only) the direct
instrument output, such as a mass spectrum or spectra, but the desired result,
including a diagnosis, such as a genotype or allelic frequency. Hence, for
example, a mass spectrometer that includes a display of the biological result of a
diagnostic test, such as the genotype or diagnosis or allelic frequency, is
provided.

The methods and systems provided herein permit multiplex analyses
including analyses of multiple reactions in a sample, and multiple sample
analyses. They also permit real time analysis and output of diagnostic tests that
require analysis and identification of a plurality of markers. The processes herein
permit each assay to be considered and can assess each assay in a
multiplexed reaction and determine the results from each assay. For example,
there are about twenty markers associated with cystic fibrosis (CF). For a
clinical test for CF that analyzes all twenty markers, it may not be possible to
perform all twenty in one multiplex reaction. Multiple samples could be required.
The methods and systems provided herein permit this to be done and the results
from different samples combined if needed. Using the algorithms, systems and
methods provided herein, results from several samples are obtained as are the
results over multiple samples. Hence the methods and systems provided herein
connect a plurality of related measurements, such as measurements that have
related biological meaning. The ultimate output is a diagnosis, such as a
genotype, which can be derived from the results of tests of a plurality of
samples. For the output, it does not matter whether there is one mass
spectrum or a plurality thereof.
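
The combination of results over multiple samples described above can be illustrated with a short sketch. The marker names, the data layout and the policy of keeping the first acceptable call for each marker are assumptions made for this example; the source does not specify how results from different samples are merged.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Tuple


def combine_panel_results(
    per_sample_calls: Iterable[Dict[str, Optional[str]]],
    panel: Sequence[str],
) -> Tuple[Dict[str, str], List[str]]:
    """Merge per-sample marker calls into one result for a marker panel.

    Each element of per_sample_calls maps marker name to a call, or to
    None when no acceptable call was made. Assumed policy: the first
    acceptable call for a marker is kept. Returns the combined calls and
    the list of panel markers that still have no call.
    """
    combined: Dict[str, str] = {}
    for calls in per_sample_calls:
        for marker, genotype in calls.items():
            if marker in panel and genotype is not None and marker not in combined:
                combined[marker] = genotype
    missing = [marker for marker in panel if marker not in combined]
    return combined, missing
```

A panel-level output, such as a diagnosis, would be reported in this sketch only when `missing` is empty, regardless of how many spectra contributed.
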

A potential problem with a system that runs biology based signal
processing in real-time is throughput. The biology based algorithms can take a
significant amount of time to run. The hardware and software provided herein
solve these problems and permit biology-based instrument control and high
throughput analyses.

The assay based judging and machine control described herein has been
used to run many high density arrays, such as 384 and 96 position chips, using
a wide range of assays and nucleic acid-containing samples and can be used for
higher density formats, such as, but not limited to, 1536 and higher. The
improvement in calling efficiency has been observed to range from 0% to over
50%. The degree of improvement depends on the quality of the assay and the
level of multiplexing.

For example, a typical experiment involved forty-eight assay, four-plex
tests that were performed on eight different DNA groups (384 reactions) and
were deposited on a 384-spot chip (see, EXAMPLES, FIG. 9 and FIG. 10). The
same chip was measured consecutively on three mass spectrometer instruments.
The first run was performed without the data analysis application software. The
"standard" configuration instrument uses fuzzy logic to control rastering based
on resolution and signal-to-noise ratio over a fixed mass range. Thus, the
biological result (e.g., genotype) is not used in controlling the instrument. The
next two runs were performed using the data analysis biological-results control
described above and as provided herein. The data results are presented below in
EXAMPLE 3 (Table 1). It should be noted that the quality of data is expected to
decrease from consecutive data runs, because the sample is depleted by
successive laser shots, and thus call efficiency should decrease. The results in
Table 1 (see EXAMPLE 3) show that the call efficiency was improved by using
assay-based judging in accord with the methods and systems provided herein to
control data acquisition. In particular, overall call efficiency was improved from
77% for the "standard" configuration to 90.9% in the first data run using the
assay-based judging.
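
As a worked check of the figures just quoted, the gain can be expressed in three ways: in percentage points, relative to the starting efficiency, and as a reduction in missed calls. Only the 77% and 90.9% values come from the text; the function name and the choice of derived quantities are illustrative assumptions.

```python
from typing import Tuple


def call_efficiency_gain(before_pct: float, after_pct: float) -> Tuple[float, float, float]:
    """Return (absolute gain in percentage points,
    relative gain in percent of the starting efficiency,
    percent reduction in missed calls)."""
    absolute = after_pct - before_pct
    relative = 100.0 * absolute / before_pct
    missed_before = 100.0 - before_pct
    missed_after = 100.0 - after_pct
    missed_reduction = 100.0 * (missed_before - missed_after) / missed_before
    return absolute, relative, missed_reduction


absolute, relative, missed_reduction = call_efficiency_gain(77.0, 90.9)
print(f"{absolute:.1f} points, {relative:.1f}% relative, "
      f"{missed_reduction:.1f}% fewer missed calls")
# prints: 13.9 points, 18.1% relative, 60.4% fewer missed calls
```
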

As exemplified, the assay-based judging provided herein added shots
from different raster positions until all assays provided acceptable results, up to
the shot limit per sample. Thus, the assay-based judging is not misled by large
primer peaks in the spectrum output or by large peaks that come from assays
other than the assay of interest. In the spectral output, for example, the peaks
at mass 6261 and at mass 6574 can be compared between FIG. 9 and FIG. 10
(see, EXAMPLE 3). It is known that these peaks represent the C and A alleles of
one of the assays in the sample. From the FIG. 10 spectrum, it is clear that the
assay should be called CA, and the system using the assay-based judging
provided herein made a CA call with a conservative score. Viewing the FIG. 9
spectrum, it is less clear what the call should be. It can be seen that the peak at
mass 6261 is especially noisy. In such a case, averaging in more shots from a
different section of the sample would help, but the "standard" configuration
judging is misled by good signal-to-noise peaks in the spectrum so that the
spectrum is judged sufficient and the system proceeds to the next sample
without acquiring additional data. The benefits of assay-based judging and data
processing as provided herein are advantageously realized in multiplexed assays.
C. Description of Exemplary Embodiments
1. Testing system

Referring now to FIG. 1, a diagrammatic representation of an exemplary
testing system 10 for testing a biological sample is shown. Generally, the
testing system 10 contains a real time (RT) workstation 12, which includes a
series of controllers that retrieve assay design parameters from a database
maintained by a database server 13 and direct the acquisition and processing of
data indicative of the biological sample from a mass spectrometer 14. The
processed data or genotyping results are then downloaded into a directory at the
database server 13. With the testing system 10 generally disclosed,
exemplary individual components are as follows. The testing system 10 has an RT
workstation 12 that can be, for example, a computer system having storage and
computational components, including one or more controllers. In one
embodiment, the RT workstation includes an assay controller 30 (also referred to
as a plate editor) that acquires assay design specifications from the database
server 13; a data acquisition controller 31, which automatically aligns
the laser on a chip using an image system, controls the motor movement of the
assay substrate at the mass spectrometer, and acquires the data signal directly
from the mass spectrometer; and a real-time data analysis
controller 32 that communicates with the controller 31 by receiving a data signal
and providing instruction for additional data acquisition. Additional data
acquisition can be dependent on the quality of the data previously obtained. As
described further below, the data quality can be assessed with respect to assay
results, such as whether a determination about the spectra results can be made.
The data can be stored on a local hard drive of the RT workstation 12 until the
results from all the samples are compiled. The compiled data is stored in a
directory in the database server 13. The RT workstation 12 can include a
display 16 for visually communicating test results and status information.

In one embodiment, the RT workstation 12 is a computer, such as an
IBM-compatible Personal Computer system, communicating with the mass
spectrometer using a known communication standard, such as a parallel or serial
interface. It can be appreciated that the workstation and controllers can be
alternatively embodied. For example, the RT workstation 12 can be integral to
the mass spectrometer 14 or another system component, or the workstation 12
and controllers 30, 31, 32 can be placed at a remote location from the mass
spectrometer. In such a manner the network topography, such as a wide area
network or a local area network, would provide a communication path between
the mass spectrometer 14 and the RT workstation 12. Although the RT
workstation 12 can be a standalone computer device, it can be appreciated that
one or more of the controllers 30, 31, 32 can be, for example, a microprocessor
or other programmable circuit device capable of performing a programmed
process. Moreover, it can be appreciated that the workstation 12, one or more
of the controllers 30, 31, 32, and the database server 13 can be integrated into
a single device, or can be separate, independently operating devices.

The mass spectrometer 14 can be a MALDI Time-of-Flight (TOF)
instrument; such instruments are known to those of skill in the art (see, e.g.,
co-pending U.S. patent application serial number 09/663,968 filed 9/19/00 and
entitled "SNP Detection Method", and U.S. patent application serial number
09/285,481 filed 4/5/99 and entitled "Automated Process Line", and published
as U.S. application Publication No. US-2002-0009394-A1). The mass
spectrometer 14 is configured with an interface to communicate with the
workstation controller 12. The interface can be designed to conform to a known
data communication standard, for ease of connection. Although a single
interface can enable the controller 12 to both receive data from the mass
spectrometer 14 and send instructions to the mass spectrometer 14, two or
more separate interfaces can be used. Although the exemplary test system 10
incorporates a MALDI TOF mass spectrometer, it can be appreciated that other
types of analytical instruments and mass spectrometers can be used.

The testing system 10 can provide the database server 13 with one or
more databases, such as database 18, database 19, database 20, database 21
and database 22 stored in direct access storage devices. Such databases can
store assay design, genotype profiles, allelotype profiles, mass spectra and other
such data or instruction sets. It can be appreciated that other forms of data
storage can be used. A structured database provides a convenient format for
storing and retrieving data. In an exemplary embodiment, one of the databases,
such as database 18, stores assay design information, a database 19 stores
genotyping profiles, a database 20 stores allelotyping profiles, database 21
stores sample identification information, while the other database 22 stores test
results for later analysis. It can be appreciated that fewer or more databases can
be used to store assay and test information. If desired, the databases can be
distributed between the workstation 12 and the database server 13. The
database server 13 can also contain one or more controllers such as controller
23 and controller 24. In an exemplary embodiment, the controller 23 monitors
the data acquisition of the individual samples on the assay substrate or chip.
Once the data are received from all samples in the assay, the data monitoring
controller 23 downloads all or part of the assay information and stores the
information in a directory in the test results database 22. The controller 24
imports the data into a directory in the results database 22.

The RT workstation 12 has sufficient processing ability to extract assay
design information from the assay design database 18, and to convert the assay
design information into a format for providing specific directions to the mass
spectrometer 14. For example, the controller can access the database 18 and
request a specific assay design. The specific assay can be set up to provide a
microtiter plate with hundreds, or even thousands, of samples on each plate.
The test can require that samples be tested in a specific order, and based upon
the result from previous tests, the order can be adjusted, or some samples can
even be eliminated from the assay. The RT workstation receives the assay
design information and converts the assay design information into commands for
the mass spectrometer 14. Upon starting the assay, the RT workstation 12
sends initialization commands to the mass spectrometer 14 consistent with the
assay design.

Extracting an assay design from a database and generating mass
spectrometer commands can be a time-consuming and processor-intensive
operation. It would be particularly undesirable for the extraction process to
interfere with the more real-time control of the mass spectrometer. Accordingly,
the RT workstation 12 can be designed to perform a database extraction
process, and database storage functions, as background tasks, or at a time
when such tasks do not interfere materially with the more real-time control of
the mass spectrometer 14. The RT workstation 12 defines a physical
map of the biological samples on the assay plate or chip by manual input of
information by the operator or by an automated scanning system, such as an optical
reader or other such reader to read bar codes or other symbologies, where a
symbology, such as bar code information, identifies the plate or chip.

The mass spectrometer 14 receives the biological sample for analysis and
generates an electrical data signal representative of information, such as a
genotype, associated with the sample tested under direction from the real time
workstation 12. The instrument is initialized when it is provided with specific
data acquisition parameters, either manually or in a default mode. The
acquisition parameters can include the number of laser shots per spot, the
maximum number of raster iterations per sample, and voltage, delay time,
calibration constants and other parameters that are well-known to those skilled
in the art. The mass spectrometer is initialized according to test assay
parameters, and acquires data indicative of the biological samples. More
particularly, the data acquired by the mass spectrometer is typically in the form
of an electronic data spectrum. The electronic data spectrum can be retrieved
by the RT workstation.

Biological samples are analyzed when the RT workstation 12 directs the
automatic alignment of the mass spectrometer laser onto an assay surface or
chip using an imaging system and controls movement of the laser from sample
to sample, and from assay surface to assay surface when multiple assay
surfaces or chips are held in a multi-component support.

Biological information, such as genotyping information, is acquired
directly from the mass spectrometer 14 by the RT workstation 12. The signal is
converted into a mass data spectrum by the RT workstation 12 where a
genotype is determined. If the sample information, such as the genotype,
cannot be called, the RT workstation 12 recognizes the situation and directs an
adjustment to the mass spectrometer 14. For example, if the acquired spectrum
has an unacceptably low signal to noise ratio, the workstation controller 12 can
direct the mass spectrometer 14 to test the same sample again, but can adjust
the mass spectrometer 14 to direct its beam at a different spot on the sample, or
can select alternative power settings or measurement filters. In another
example, the controller 12 can direct the mass spectrometer 14 to take a series
of data sets from the same sample until the standard deviation in the aggregate
results achieves a desired degree of certainty. It should be understood that,
even though the same sample can be tested multiple times, each test is taken
from a unique spot on the sample.

As noted, in accord with the methods and systems provided herein, the
test criteria additionally involve the results of the assays performed. A system
provided herein uses biology-based decision outcomes (such as conclusions
about genotype of a sample) to control the operation of the test machine and to
determine if repeated testing (rastering) is needed.

The modular design of an exemplary embodiment, with a data acquisition
component and data analysis component (FIG. 1), provides a great deal of
flexibility. Each component can be modified in operation to develop special
purpose operation and control for calling biological assays. Thus, the component
operation can be modified to suit different assay types. Those skilled in the art
understand that a wide variety of component interface configurations can be
used to facilitate communications between the analysis software application
components.

2. RT Workstation 

FIG. 7 depicts the RT workstation of FIG. 1. Components 206, 208, and
210 are part of the RT workstation 12 in FIG. 1; components 204 and 202 make
up the mass spectrometer 14 in FIG. 1. FIG. 7 is a block diagram of an
exemplary testing system 200 provided herein. The system of FIG. 1 (using the RT
station of FIG. 7) and processing in accordance with FIGS. 2 and 4,
discussed below, provides efficient instrument operation. In the embodiment of
the RT station depicted in FIG. 7, a mass spectrometer instrument 202 that is
controlled by a mass spectrometer workstation 204 communicates with a
computer analysis workstation 206 that operates in accordance with the
description herein. Exemplary instruments include a MALDI time-of-flight
instrument, such as a "Biflex" mass spectrometer available from Bruker Daltonik
GmbH of Bremen, Germany. The instrument controlling workstation 204 can
include, for example, a Sun workstation, available from Sun Microsystems, Inc.
of Santa Clara, California, USA.

The workstation 206 can be configured as a Personal Computer (PC)
equipped with a digitizer 208 and a frame grabber 210. The frame grabber
receives video image data from a sample visualization camera (not illustrated) of
the instrument that is part of the machine visualization system described above.
The frame grabber can include the model IMAQ PXI-1411 from National
Instruments Corporation of Austin, Texas, USA. The digitizer 208 receives
analog data and converts it to a digital representation. The digitizer can include,
for example, a model "PDA500" 500-MHz, 8-bit digitizer from Signatec, Inc. of
Corona, California, USA.

In the system 200, four signals that ordinarily are routed between the 
instrument 202 and the instrument workstation 204 are instead routed between 
the instrument and the computer analysis workstation 206. The four signals are 
included in or form the output of the MCP detector of the instrument, the trigger 
control for the laser and high voltage electronics of the instrument, the output of 
a photodiode detector used to trigger data acquisition, and the video signal from 
the sample visualization camera. The video signal provides the image viewer 
display 136 (FIG. 6). The output of the MCP detector passes through a gain-five 
amplifier and a passive low-pass filter having a cutoff frequency of 90 MHz. The 
amplifier can include, for example, a Stanford Research Systems SR445 amplifier 
(Stanford Research Systems, Inc. of Sunnyvale, California, USA), and the low- 
pass filter can include, for example, a Mini Circuits BLP-90 filter (Mini-Circuits, 
Inc. of Brooklyn, New York, USA).

The instrument workstation 204 and the computer analysis workstation
206, for example, can communicate with each other using a network connection
protocol, such as the TCP/IP protocol. The communications are used by the
computer analysis workstation 206 to move the sample handling stage of the
instrument 202 from sample to sample of a plate or chip, and to move the stage
so as to position each individual sample beneath the laser of the instrument at
different raster positions.

Thus, the computer analysis workstation 206 has the capability of
acquiring spectra output, processing the output, and controlling the instrument
202 according to the output in real time. The instrument output in terms of
biology-based results, such as genotype, is used to determine if a sample
should be rastered (multiple data acquisition from the same sample). The
workstation 206 executes a software application from which a user can specify
a number of setup operating parameters to control decision making. The
software application is installed at the workstation 206 into program memory
(not illustrated). The installation can occur, for example, through magnetic
media (floppy disks) or optical media (such as CD discs or DVD data discs) or
can occur through network communications download. This permits the
operation of the workstation to be easily modified through modifications to the
application software.

For example, in an exemplary embodiment, the analysis workstation 206
is set up to run a series of samples through the instrument wherein, for each
sample, a predetermined, set number of instrument activations (laser shots) are
performed and output is averaged to create the spectra produced by the
instrument. The number of shots per sample can be specified by a user for a
sample run. The workstation collects the shot results, averages them, and then
independently judges each assay that was defined or specified for that sample.
If the result of judging (the assay score) for a sample is less than a "moderate"
ranking, then the workstation 206 causes additional data collection (rastering)
from the instrument for the assay in question.

Any additional data from a sample that is ordered by the workstation 206
is averaged together and the data are added to the first data collected from the
instrument 202. The result for the assay is determined, as before, and if the
assay score is improved by the addition of the new data, then the new summed
data collection is kept. The cycle of data collection and judging continues until
the number of shots or attempts has exceeded a predetermined limit, or
until a score achieving a "moderate" ranking or better is achieved for each assay
in a sample. The operation of the system 200 is described in greater
detail with respect to FIG. 4.
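
By way of illustration only, the cycle of data collection and judging described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed software: the `Rank` ordering, the function names `fire_shots` and `judge`, and the default limits are hypothetical, and a single assay per sample is assumed for brevity, whereas the text judges each assay independently.

```python
from enum import IntEnum


class Rank(IntEnum):
    # Hypothetical ordering; the text names only a "moderate" ranking.
    NONE = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3


def average(spectra):
    """Point-by-point mean of equal-length single-shot spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def run_sample(fire_shots, judge, shots_per_raster=20, max_rasters=5):
    """Collect data for one sample until it scores MODERATE or better.

    fire_shots(n) -> list of n single-shot spectra from a fresh raster position
    judge(spectrum) -> Rank (the assay score)
    Both callables are assumptions standing in for the instrument and the
    assay-judging step.
    """
    kept = average(fire_shots(shots_per_raster))
    best = judge(kept)
    rasters = 1
    while best < Rank.MODERATE and rasters < max_rasters:
        extra = average(fire_shots(shots_per_raster))
        # Add the newly averaged data to the data already kept.
        candidate = [a + b for a, b in zip(kept, extra)]
        score = judge(candidate)
        if score > best:
            # Keep the summed data only if the assay score improved.
            kept, best = candidate, score
        rasters += 1
    return kept, best, rasters
```

In this sketch the summed spectrum is discarded whenever the added raster does not improve the score, which mirrors the statement that the new summed data collection is kept only on improvement.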

3. Exemplary testing processes

Referring now to FIG. 2, a method of testing a biological sample is
shown. The exemplified method of testing first predefines spectrum criteria that
predict the presence of a biological relationship in block 21. Such criteria
include, but are not limited to, allelic ratios, markers, signal strength, genotype
calling method and associations. The predefined spectrum criteria vary
depending on the assay to be run. For example, the spectrum criteria can be set
to assure a minimum allelic ratio is exceeded. In this regard, the spectrum
criteria can be set to reject acquired data where the allelic ratio is below a
threshold, such as 5%. In another example, the presence of specific markers
can be required to validate acquired data. In another example, the spectrum
criteria can require that a peak exceed a signal to noise figure before accepting
the acquired data as valid. Further, statistical methods can be applied to the
acquired data, or sets of acquired data, to determine if a particular peak is
statistically significant. Using such a statistical method can dramatically
increase the accuracy of calling the composition of a biological sample (see, e.g.,
co-pending U.S. patent application serial number 09/663,968 filed 9/19/00 and
entitled "SNP Detection Method", and U.S. patent application serial number
09/285,481 filed 4/5/99 and entitled "Automated Process Line", and published
as U.S. application Publication No. US-2002-0009394-A1, which exemplify
application of statistical methods to acquired spectrum data). It can be appreciated
that the spectrum criteria can be defined in numerous ways consistent with the
teachings herein.
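
As a non-limiting illustration, predefined spectrum criteria of the kinds listed above could be represented and checked as sketched below. The structure is an assumption made for this example: the field names, the signal to noise floor of 3.0, and the `measured` dictionary layout are hypothetical, and only the 5% allelic ratio threshold is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SpectrumCriteria:
    min_allelic_ratio: float = 0.05           # the 5% example threshold from the text
    min_snr: float = 3.0                      # assumed floor, for illustration only
    required_markers: frozenset = frozenset()  # markers that must be present


def meets_criteria(measured, criteria):
    """Return (ok, failures) for one acquired data set.

    measured is a hypothetical summary of an acquired spectrum with the keys
    "allelic_ratio", "peak_snr" and "markers".
    """
    failures = []
    if measured["allelic_ratio"] < criteria.min_allelic_ratio:
        failures.append("allelic ratio below threshold")
    if measured["peak_snr"] < criteria.min_snr:
        failures.append("signal to noise too low")
    missing = criteria.required_markers - set(measured["markers"])
    if missing:
        failures.append("missing markers: " + ", ".join(sorted(missing)))
    return (not failures), failures
```

Returning the list of failed criteria, rather than a bare pass/fail flag, is a design choice of this sketch; it would let a controller decide which adjustment to attempt next.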

With the spectrum criteria predefined, block 22 shows that the assay
design is defined, and then can be stored in a database for use in controlling the
instrument. Exemplary assays include, but are not limited to, SNP
detection/identification assays, diagnostic assays, biomolecule
detection/identification assays and other assays, particularly assays that
involve detection and/or identification of a biomolecule or biopolymer. In an
exemplary embodiment, the instrument is a MALDI TOF mass spectrometer. It
can be appreciated that other instruments, including other mass spectrometers,
can be substituted. The defined assay design is used to generate the initial
settings for the instrument, and then is further used to direct the instrument
during the assay test.

Biological samples are then positioned in block 23 for test in the
instrument as defined and required by the steps and protocol of a particular
assay. The samples can be positioned on a support (holder), such as a
microtiter plate. It can be appreciated that other types of supports, including but
not limited to test tubes or chips, can be substituted for a microtiter plate.
Although it is more convenient to place all samples for one assay on a single
support, samples for a single assay can be placed on multiple supports.

The support is positioned in the instrument, as indicated in block 24. The
support can be manually positioned, or can be positioned under robotic control.
If the support is robotically controlled, then information extracted from the assay
design can be used to direct the robotic control to place the proper support in
the instrument. If manually positioned, a visual display can be used to assist the
human operator in identifying and verifying the proper support.

Blocks 25-28 represent the real time control of the instrument and are
described further below. This real time control permits the automated and
efficient operation of the instrument, and provides accuracies and repeatabilities
in test results that are not available in known systems.

In block 25, the instrument acquires a data set from a biological sample.
In an exemplary embodiment, the acquired data are in the form of an acquired
data spectrum. In the exemplary system described in the '968 Application
referenced above, the data set is generated by first finding the height of each
peak, then extrapolating a noise profile and finding the noise of each peak, next
calculating the signal to noise ratio (s/n ratio), finding the residual error,
calculating an adjusted signal to noise ratio, developing a probability
profile, determining peak probabilities, determining an allelic penalty,
adjusting the peak probability by the allelic penalty, calculating genotype
probabilities, and testing the ratio of genotype probabilities.

The acquired data are evaluated in block 26. In an exemplary
embodiment, the acquired data are compared against the spectrum criteria
previously defined. As described above, this comparison can be, for example, a
comparison of peak strength, peak position, markers, s/n ratio, allelic ratio, or a
statistical calculation. Further, the comparison can be multi-dimensional, for
example, requiring first that a particular marker be located and then testing that
an appropriate signal to noise ratio exists. It can be appreciated that the
comparison step 26 can use data from multiple acquired data sets, for example,
to calculate the standard deviation for the group. Accordingly, the comparison
compares the standard deviation in the group of data sets to determine if the
results should be derived from the newly acquired data.

Responsive to the comparison, the workstation controller adjusts the
instrument in block 27. For example, if the signal to noise ratio was too low in a
first data set, the instrument can be adjusted to test the same sample, but at a
different spot on the sample. By moving to a new target spot, new data can be
acquired for the same sample. This is referred to as rastering. In testing the
new spot, it is quite possible that different or better analytical results can be
found. Thus, taking a reading at a second spot can enable making an analytical
call on a sample when it was not possible with only a single spot test. Further,
testing additional spots on an individual sample can permit the calculation of
aggregate results with a lower error rate than relying solely on a single test spot.
By automating the evaluation of the acquired data and control of the instrument,
the overall assay test can be manipulated to provide a requisite level of accuracy
and tolerance. Accordingly, the maximum number of samples can be accurately
called for a particular assay, but yet time and system resources are not wasted
by testing more spots than necessary.

In particular, using assay results or outcome as a test criterion to control
further sample testing has been found to increase the efficiency and throughput
of the testing system by reducing unnecessary test cycles (rastering) and
increasing the reliability of the test results.

After the instrument is adjusted and set to acquire a next data set, the
method returns to block 25 to acquire the next data set. As described above,
the next data set can be for the same sample, or the instrument can have been
adjusted to the next sample. After testing is completed, processing moves to
block 28.

Block 28 shows that the results from the acquired data are analyzed to
determine the presence of an object biological relationship. For example, the
assay can be attempting to locate particular single nucleotide polymorphisms
(SNPs), or can be allele typing, or can be genotyping. Irrespective of the
particular biological relationship searched for, the relative success of the search
can be used by the FIG. 2 testing method in directing further data acquisitions.
For example, if in a multiple sample assay, the biological relationship is ruled out
after only the first sample, then the method can be directed to skip testing the
rest of the samples in the assay and move on. In another example, if after
testing multiple samples for a particular assay the results are still ambiguous,
block 28 can be used to determine if the ambiguity can be removed by
increasing the certainty of the results for a particular sample. If so, the test can
be directed by the workstation to automatically take additional data acquisitions
and attempt to salvage the assay. Without such an automated and intelligent
process, the assay would be rejected. Accordingly, the FIG. 2 testing method
provides a higher level of calls, and a higher level of call certainty than with
known testing methods.

Referring now to FIG. 3, another method of testing a biological sample is
shown. The FIG. 3 testing method 40 generally has an initialization loop 41, a
control loop 42, and a results loop 43. The control loop 42 is responsible for
acquiring data sets, comparing the data sets to predefined spectrum criteria, and
adjusting the instrument responsive to the evaluation of the acquired data. In
this regard, the control loop must operate efficiently enough to permit the timely
operation of the overall test system. Therefore, certain of the setup and storage
functions have been off-loaded to the background loops 41 and 43. It can be
appreciated that more or less functionality can be placed in the background
loops 41, 43 to accommodate different response times needed in the control
loop 42.

The initialization loop 41 is a background loop that permits storage of
assay design and plate information in block 44. Typically, the assay design and
plate information is stored in a database form. The database of assay
design and plate information can be used by multiple test systems, and can be
accessed remotely. In such a manner a remote researcher can define an assay
in a single database, and that newly defined assay can be operated on multiple
test systems.

Since extracting and converting the assay information into control
information is a time-consuming process, the extraction process is performed in
the background, in block 45. Of course, it can be appreciated that as typical computer
workstation computational powers increase, it can be desirable to have the extraction
process made a part of the control loop 42. Since the extracting step is
generally a background step, the extraction process can be performed for a next
assay while the control loop 42 is actively performing an assay. Thus, when the
control loop has finished an assay, the extracted information from block 45 can
be sent to block 51 to start the control loop 42 for a next assay.
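
One way to picture this overlap of background extraction with the control loop is the sketch below, which prepares the next assay on a background thread while the current assay runs. It is an assumption-laden illustration, not the disclosed implementation: the use of a thread and a bounded queue, and the names `extract` and `run_control_loop`, are hypothetical stand-ins for block 45 and the control loop 42.

```python
import queue
import threading


def extraction_worker(assay_ids, extract, out_q):
    """Background step (block 45): extract and convert each assay design."""
    for assay_id in assay_ids:
        out_q.put(extract(assay_id))  # blocks while one prepared assay is waiting
    out_q.put(None)                   # sentinel: no more assays


def run_assays(assay_ids, extract, run_control_loop):
    """Run the control loop on each assay while the next one is prepared."""
    ready = queue.Queue(maxsize=1)    # hold at most one prepared assay ahead
    worker = threading.Thread(
        target=extraction_worker, args=(assay_ids, extract, ready), daemon=True
    )
    worker.start()
    results = []
    while True:
        commands = ready.get()        # block 51 receives the extracted information
        if commands is None:
            break
        results.append(run_control_loop(commands))
    worker.join()
    return results
```

The bounded queue keeps the background step at most one assay ahead, so extraction work does not pile up while the instrument is busy.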

The information from block 45 is received in block 51, where the
information is used to initialize the instrument. In an exemplary embodiment, the
instrument is a MALDI TOF mass spectrometer. The initialization commands can
include identifying the first sample to test, the proper power settings, and the
desired filtering for the data.

A sample is selected for test in block 52, and data are acquired from the
test sample in block 53. The acquired data can be sufficiently processed to
determine target characteristics for the acquired data. For example, if signal to
noise ratio is an important indication of test quality, then a signal to noise ratio
can be calculated for the acquired data. More particularly, the acquired data are
processed to facilitate comparison with predefined spectrum criteria.

The predefined spectrum criteria, as previously discussed, define the
analytical characteristics for good data. In block 54, the acquired data are
compared to the predefined spectrum criteria and are further processed
to extract biological information. If the acquired data are good, a "YES"
outcome at block 54, the data are formatted and displayed in block 58. If
the acquired data are not good, however, a "NO" outcome at block 54, then
block 55 asks if the maximum number of spots have been shot for this sample.
Block 55 is a check for the maximum number of rasters. For example, a typical
mass spectrometer can take a maximum of about 15 to 20 shots on any given
sample. To assure the integrity of the test, it can be advisable to set the
maximum to a safe number, such as 20, or other number depending upon the
sample and the instrument. The sample is not further processed if the maximum
number of rasters has been exceeded (a "YES" outcome at block 55). At each
raster position, 20 laser shots are measured and averaged to get a spectrum
from that laser position. Thus, if fewer than 10 spots have been shot, a "NO"
outcome at block 55, then the instrument is adjusted to a new spot in block 56,
and data are acquired on the new spot in block 53. In block 54, the newly
acquired data are compared to the spectrum criteria. Alternatively, block 54 can
use aggregated data from multiple test spots to determine if the aggregated data
are good.

Once a sample has been judged to provide good or bad assay results, or
if the maximum shots have been exceeded, then block 59 asks if there are more
samples in the assay. If so, a "YES" outcome at block 59, then the instrument
is adjusted in block 61 to shoot the next sample. If all the samples have been
tested, a "NO" outcome at block 59, then the control loop 42 resets and a next
assay loop is initiated at block 51.
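
The flow through blocks 52 to 61 can be sketched, for illustration only, as a nested loop over samples and raster spots. The function names `acquire` and `is_good`, the dictionary layout of `samples`, and the default raster limit are assumptions of this sketch rather than features of the described system.

```python
def control_loop(samples, acquire, is_good, max_rasters=10):
    """Sketch of the FIG. 3 control loop for one assay.

    samples: dict mapping sample id -> list of candidate spot positions
    acquire(sample_id, spot) -> acquired data for that spot (block 53)
    is_good(data) -> True when the data meet the spectrum criteria (block 54)
    max_rasters: illustrative cap on spots per sample (block 55)
    """
    results = {}
    for sample_id, spots in samples.items():      # blocks 52, 59 and 61
        called = False
        rasters = 0
        for spot in spots[:max_rasters]:          # block 55 limits the rasters
            data = acquire(sample_id, spot)       # block 53
            rasters += 1
            if is_good(data):                     # block 54, "YES" outcome
                called = True
                break
            # "NO" outcome: block 56 moves the instrument to the next spot
        results[sample_id] = {"called": called, "rasters": rasters}
    return results
```

For example, with a limit of two rasters, a sample whose second spot yields good data is called after two acquisitions, while a sample with no good spot is abandoned after two acquisitions and the loop moves on.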

When the control loop 42 is complete, then the results from the assay are
passed to the background results loop 43. The results loop 43 can perform
additional post processing on the data in block 63, which can include a manual
review of the results. The data and results can then be stored in block 65.
Typically, the data and results are stored in a database that is accessible from
remote locations so a remote researcher or other test operators can review the
results. The loop repeats for all assay results.

Referring now to FIG. 4, another testing method 70 is illustrated. The
testing method 70 allows an assay designer to establish a minimum standard for
each biological sample in block 71. More particularly, the testing method 70 is
directed to increasing the confidence in the results from each sample. As
discussed above, a typical mass spectrometer can take a data set from multiple
spots on a single biological sample. The testing method 70 enables the test to
dramatically increase the confidence for each sample, while minimizing the
number of testing samples that must be acquired.

In the testing method 70, a biological sample is selected in block 72, and
a data set is acquired in block 73. In block 74, the acquired data are evaluated
against the data criteria set for the sample. For example, the data criteria can
expect a signal to noise ratio to exceed a floor value. In this regard, each data
set acquired for a particular sample is compared against the data criteria.
Alternatively, data collected from multiple shots on the same sample can be used
in the comparison. For example, the data criteria can require that the standard
deviation between spots on the same sample not exceed a particular value.
Thus the comparison step could include determining the standard deviation for
all spots in the single sample to determine if confidence is sufficiently high to call
the sample. It can be appreciated that the comparison step can entail a wide
range of analytical and algorithmic calculations, either on individual data sets or
aggregates of data sets.

Importantly, the testing method 70 permits setting the data criteria in a
manner that minimizes the number of data acquisitions. For example, the data
criteria could be to accept a sample when a single data set has a signal to noise
ratio meeting one level, or meeting a lower level for aggregate data sets. Thus,
a single strong reading would be sufficiently robust, and multiple shots would
not be needed on that sample. In a similar manner, the comparison could be set
to accept sample data if the standard deviation between two successive shots is
less than 5%, or accept the data if the standard deviation is less than 7% for 3
shots, or less than 10% for 4 or more shots. Such flexible data criteria permit
the assay designer to set a high degree of confidence with a minimum of data
readings. Accordingly, the testing method 70 operates at a high degree of efficiency
and accuracy as compared to known systems.
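
The tiered acceptance rule in the preceding example can be written out as a short sketch. The percentage limits are those quoted above; expressing the standard deviation as a fraction of the mean (a relative standard deviation), and the names `TIERS`, `relative_std` and `accept`, are assumptions of this illustration, since the text does not state how the percentage is computed.

```python
import statistics

# Limits quoted in the text, keyed by number of shots; the entry for 4
# applies to 4 or more shots.
TIERS = {2: 0.05, 3: 0.07, 4: 0.10}


def relative_std(values):
    """Sample standard deviation divided by the mean (assumed interpretation)."""
    mean = statistics.mean(values)
    if mean == 0:
        return float("inf")
    return statistics.stdev(values) / abs(mean)


def accept(values):
    """Return True when the readings for one sample satisfy the tiered rule."""
    n = len(values)
    if n < 2:
        return False                 # a spread needs at least two readings
    limit = TIERS[min(n, 4)]
    return relative_std(values) < limit
```

Under this reading, two readings of 100 and 110 are not accepted (spread of about 6.7%), but adding a third reading of 105 brings the spread to about 4.8%, which is within the 7% limit for three shots.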

As noted above, the testing method also permits setting the data criteria
to depend on assay test results or other biological-based criteria. In that
circumstance, the comparison could be set to accept a sample if the results of
an assay indicate that a particular genotype, for example, has a high probability
(greater than 50%, 60%, 70%, 80%, 90% or greater depending upon the test
and genotype and other variables), and to continue with acquiring data if the
genotype is still uncertain.
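
A results-based criterion of this kind might be sketched as follows. The function name `call_genotype`, the dictionary of genotype probabilities, and the default threshold of 0.9 are hypothetical; the text leaves the threshold to the test, the genotype and other variables.

```python
def call_genotype(probabilities, threshold=0.9):
    """Return the most probable genotype if it clears the threshold, else None.

    probabilities: dict mapping genotype label -> probability for one sample.
    A return value of None means the genotype is still uncertain and more
    data should be acquired.
    """
    if not probabilities:
        return None
    genotype, p = max(probabilities.items(), key=lambda item: item[1])
    return genotype if p > threshold else None
```

A controller could call this after each acquisition and request another raster whenever the result is None.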

Once the data criteria have been met, a "YES" outcome at block 75, the
results are stored in block 76, such as in a database, and the instrument is
adjusted to move to the next sample in block 77. Accordingly, a new sample is
selected in block 72.

If the data criteria have not yet been met, a "NO" outcome at block 75,
then block 78 asks whether all spots on the sample have been shot. If unshot
spots exist, a "NO" outcome at block 78, the instrument is adjusted in block 79
to acquire data from a new spot, and the data are acquired from the same
sample at the new spot in block 73. If the data criteria are not met, "NO" at
block 75, and there are no unshot spots, a "YES" outcome at block 78, then
that particular sample is rejected, and the test moves on to a new sample at
block 72.

Referring now to FIG. 5, a diagnostic testing method 100 is exemplified.
The diagnostic testing method is directed to finding a relationship among a set of
samples that proves a particular biological relationship exists. For example,
certain clinical diagnostics can look at multiple samples from an individual before
identifying that the individual is at risk for a particular disease, such as cystic
fibrosis, where identification of multiple markers is needed for a diagnosis to be
made. The diagnostic testing provided herein permits a clinical diagnosis at a
high level of certainty and at a high level of efficiency, such as those using mass
spectrometry-based systems.

The diagnostic testing method 100 receives an assay design and
relationship criteria at block 101. The relationship criteria define the range of
values and certainties where a relationship can be identified. In an exemplary
embodiment, a relationship is the likelihood or risk that a particular individual will
contract a particular disease. Since the accuracy of such assessment has
serious consequences, it is crucial that such an identification be made only under
the most confident conditions. Accordingly, known systems have required
redundancies and over-testing to build confidence sufficient to make such a
drastic announcement regarding an individual's health.

In block 102, a set of samples is identified for testing for the relationship.
As there can be several, even tens of, samples to test, the set of samples
can be present on multiple supports. Thus the testing method 100 should
account for instructing an operator or a robot to deliver and load different
supports as needed.

A particular sample is selected from the set in block 103, and data are
acquired from the sample in block 104. The acquired data are evaluated against
the relationship criteria in block 105. In an exemplary embodiment, testing
system 10 (FIG. 1) incorporates aspects of the previously discussed testing
method 70 to increase the confidence that the results from an individual sample
are robust. The previously discussed method of over-sampling (rastering) a
single biological sample can dramatically increase the confidence in the data
from a single sample.

In block 106, the acquired data are evaluated to determine if they support
the object relationship. If the data do not support the object relationship, a
"NO" outcome, then it is reported that the relationship does not exist in the set
in block 111, and the test moves on to the next set of samples in block 110.
Due to the high degree of confidence in sample results, it is possible for the
testing method 100 to reject the entire sample set and move to the next set.
Accordingly, the testing method 100 can operate efficiently.

If block 106 finds that the data does support the relationship, a "YES"
outcome, then block 107 asks if the data acquired thus far conclusively proves
the relationship exists in accordance with the predefined criteria. If enough data
has been collected, and the relationship proved, a "YES" outcome at block 107,
then block 112 reports that the relationship exists, and the test moves on to the
next set of samples. Thus, the testing method 100 only takes the necessary
number of data acquisitions to call a diagnosis, enabling efficient operation.

If block 107 finds that the collected data does not prove the biological
relationship, a "NO" outcome, then block 108 asks if there are any more
samples to be tested in the sample set. If no more samples exist, a "NO"
outcome at block 108, then block 113 reports that the relationship could not be
proved, and the test moves on to the next sample set at block 110. If there are
more samples to be tested, a "YES" at block 108, then the instrument is
adjusted to the next sample in block 109, and data are acquired from the new
sample in block 104.
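The decision flow of blocks 103 through 113 can be sketched as a short loop. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the function and enum names are hypothetical, and `acquire`, `supports` and `proves` stand in for the instrument acquisition and for whatever relationship criteria the assay designer supplies at block 101.

```python
from enum import Enum


class Outcome(Enum):
    NOT_PRESENT = "relationship does not exist"      # reported at block 111
    PRESENT = "relationship exists"                  # reported at block 112
    NOT_PROVED = "relationship could not be proved"  # reported at block 113


def evaluate_sample_set(samples, acquire, supports, proves):
    """Walk one sample set through the flow of FIG. 5.

    acquire(sample)  -> data for one sample (block 104)
    supports(data)   -> True if the data acquired so far support
                        the relationship (block 106)
    proves(data)     -> True if the data acquired so far conclusively
                        prove the relationship (block 107)
    Returns the outcome and the number of samples actually measured.
    """
    acquired = []
    for sample in samples:                   # blocks 103 and 109
        acquired.append(acquire(sample))     # block 104
        if not supports(acquired):           # block 106, "NO"
            return Outcome.NOT_PRESENT, len(acquired)
        if proves(acquired):                 # block 107, "YES"
            return Outcome.PRESENT, len(acquired)
    # Block 108: no more samples remain and nothing was proved.
    return Outcome.NOT_PROVED, len(acquired)
```

The second returned value makes the efficiency argument visible: the loop stops as soon as the relationship is either ruled out or proved, so later samples in the set are never measured.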

FIG. 6 shows an example user display 130 for a test system. The user
display 130, for example, can be presented on a computer monitor connected to
an IBM compatible computer system, such as the workstation 12 and display 16
shown in FIG. 1. In an exemplary embodiment, the user display 130 is
presented using a Microsoft® Windows® compatible display program, such as an
application program provided herein and that is installed on the workstation 12.
The user display 130 has a spectrum window 132 for displaying a data
spectrum of the most recently acquired data set. The spectrum window 132
enables an operator to watch, in near real-time, the data collected by the
instrument. If multiple spots are shot for a particular sample, each successive
data spectrum can be displayed in a different color so variations between spots
are easily identified.

The user display also has a support representation window or frame 134.
The support representation of FIG. 6 shows individual sample wells in a
microtiter plate. For example, a well representation shows the wells in a
physical microtiter plate support. As each well is tested, the well representation
turns a different color based on whether the sample was accepted or rejected. A
results display 138 shows assay data and a results quality display 140 shows
run data for data sets. Accordingly, as the test progresses, an operator can
identify certain systemic problems. For example, if all wells in a particular
column fail, then there can be a problem with the syringe used to fill that
particular column.
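The column-failure observation above is made visually by the operator, but the same check can be expressed programmatically. The following is a hypothetical sketch only; it assumes wells are identified by a single row letter followed by a column number (for example "A1" through "P24" on a 384-well plate), which is a common convention but is not specified in the text.

```python
from collections import defaultdict


def failed_columns(well_status):
    """Return the sorted column numbers in which every tested well was rejected.

    well_status maps a well ID such as "B7" to True (sample accepted)
    or False (sample rejected). The ID format is an assumption: one row
    letter followed by the column number.
    """
    by_column = defaultdict(list)
    for well_id, accepted in well_status.items():
        column = int(well_id[1:])
        by_column[column].append(accepted)
    # A column is flagged only if none of its tested wells was accepted.
    return sorted(col for col, results in by_column.items() if not any(results))
```

A flagged column does not by itself prove a dispensing fault; it only points the operator to a pattern worth inspecting.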

The information provided in the results display 138 can include a column
of information containing a well identification number for each well of a sample
plate or chip, along with an assay identification number that identifies the assay
profile for the corresponding well. The information can also include, for
example, in a genotyping assay, a genotype outcome column and a status
column. The status column can be designed to indicate the degree of
confidence with which the outcome, such as a genotype outcome, is made, if
applicable. The status is typically indicated with a "conservative" indication,
meaning a high level of confidence in the genotype call, or an "aggressive"
indication, meaning low confidence in the genotype call. Other status indicators
can indicate a moderate level of confidence or data that is insufficient to make a
genotype call within the levels specified by system setup parameters.

The user interface 130 also has a sample view 136 which shows a live
image of the sample tested. With this view, an operator can visually identify
spots that have been used within a particular sample. Also, the operator may be
able to identify certain systemic problems, such as too small a sample deposited
into certain wells.



WO 02/086794

PCT/US02/12903

4. Assay-based judging feedback for modification of data acquisition
and/or analysis

Assay-based judging provides numerous advantages in high throughput
formats and, particularly, in multiplex assay formats. In light of the disclosure
herein, those of skill in the art can envision a variety of such advantages. The
performance metric tracked is whether an individual assay in a test gives
acceptable performance. If it does not, it is not used in the criteria for the rest
of the run. The criteria include, for example, acceptable genotyping, or allele
frequency determination, or support of a diagnostic conclusion. Another
criterion can be success of a group of assays that support a diagnostic
conclusion. For example, if five assays are needed to support the conclusion, and
it is determined that one has failed, the other four might not be run, even if those
are good.

As an example, the performance of a particular assay that is performed a
plurality of times in a run can be used as a heuristic guide to the processor. If
an assay or set of assays fails on a particular sample, the failure can indicate a
problem with the assay or with the processing of that sample, or the problem
can extend to all samples in the sample set. If the problem exists for all samples
in the set, the failed assay can be removed from the determination of success or
failure for the set of criteria used to evaluate the success or failure of the sample
measurement. Hence when measuring a set of samples using assays that are
common to more than one of the samples in the set, a failed assay can be
removed. As a result, the throughput speed of the system is improved by using
assay performance history to determine if additional data should be collected for
a particular assay.

FIG. 8 is a flowchart that illustrates another embodiment of the assay-based
judging methods (see, e.g., FIG. 4 and description thereof; see also the
EXAMPLES) provided herein. In the embodiment depicted in FIG. 8, performance
history based on assay outcome is evaluated and used to modify data
acquisition.

In block 225, the system uses the assay performance history to adjust
the criteria. The system does this by keeping statistics on the performance of all
assays that are run on a set of samples. If an assay fails to provide an
acceptable determination (for example, no genotype determination) for a
succession of samples, then the system removes that assay from the criteria
used to determine if more measurements on a sample are necessary. The
number of times an assay can fail before removal from the criteria can be
adjusted through user input, through the programming user interface of the
workstation application program. Such number is predetermined, and is a
function of the assays performed, samples and other parameters as needed. It
should be noted that the results of assays that have failed continue to be
calculated and stored in a results database. In an exemplary embodiment, if the
assay begins to give good results, then the assay is returned to the set of criteria
that is used to determine if additional data acquisition is needed and to control
the data acquisition. In this way, use of assay performance history improves the
efficiency of the test system.
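The bookkeeping described for block 225 can be sketched as a small history tracker. This is an illustration under stated assumptions: the class and method names are hypothetical, the default failure limit of 3 is an arbitrary placeholder for the user-adjustable number, and a single passing result is assumed to be enough to return an assay to the criteria (the text says only that the assay is returned if it "begins to give good results").

```python
class AssayHistory:
    """Track consecutive failures per assay across the samples of a run."""

    def __init__(self, max_consecutive_failures=3):
        # User-adjustable limit; the default value here is an assumption.
        self.max_consecutive_failures = max_consecutive_failures
        self.consecutive_failures = {}

    def record(self, assay_id, passed):
        """Record a "passed" or "failed" outcome for one assay on one sample."""
        if passed:
            # Assumption: one good result resets the failure streak.
            self.consecutive_failures[assay_id] = 0
        else:
            previous = self.consecutive_failures.get(assay_id, 0)
            self.consecutive_failures[assay_id] = previous + 1

    def active_assays(self, assay_ids):
        """Return the assays that still count toward the rastering criteria."""
        return [
            assay_id
            for assay_id in assay_ids
            if self.consecutive_failures.get(assay_id, 0)
            < self.max_consecutive_failures
        ]
```

Note that an assay dropped from `active_assays` is only excluded from the decision to acquire more data; consistent with the text, its results would still be calculated and stored for every sample.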

With the evaluation criteria selected at block 225, the system determines
if the data collected from the instrument meet the criteria at block 226. If the
criteria are met, a "YES" outcome at block 226, then at block 227 the acquired
data are recorded, including an assay performance record. An assay that
provides a successful result is marked in the performance history as "passed".
The acquired data and performance history record are recorded into a results
database at block 231. At block 232 the instrument is adjusted for acquiring
data from a new sample.

If the evaluation criteria are not met from a sample, a "NO" outcome at
block 226, then at block 228 the system checks to determine if the maximum
number of data sets have been acquired from the sample. The maximum
number specifies a limit on the number of raster attempts that is performed on a
single data sample in an attempt to get a successful outcome for all assays. If
the maximum number of data sets has been reached, a "YES" outcome at block
228, then the system marks the failed assays as "failed" and marks any passed
assays as "passed", at block 233. The system then records the acquired data
and performance history in the database at block 231 and then adjusts the
instrument for more data at block 232. If the maximum number of data sets has
not been reached, a "NO" outcome at block 228, then at block 229 the system
adjusts the instrument to acquire additional data from the sample (i.e., the
system rasters the sample).

FIG. 8 thus shows an embodiment in which, in a multiplexed reaction,
such as one in which 5 assays are run on 384 samples, the system can identify
a failing reaction. The system starts at the first locus, and can be designed so
that if the same reaction is failing after a predetermined number of loci, it stops
rastering for that reaction in all of the remaining samples. Thus, the system can
learn that a reaction is failing and take it out of the criteria. This speeds up the
processing of the remaining samples. For example, if four out of five of the
reactions run well, but a fifth does not, the system can eliminate the fifth from
consideration in all samples once the failure is detected.

The following examples are included for illustrative purposes only and are
not intended to limit the scope of the invention.

The following examples and the above detailed description depict
application of the methods and systems (and comparison with prior systems)
using mass spectrometry. It is understood that mass spectrometers and mass
spectrometry are exemplary of instruments and output methods that can be
employed in assay-based judging systems and methods as provided herein. The
medium, such as a microtiter plate, for testing in a particular instrument, can be
adapted for a particular instrument, and include support for retaining or
containing molecules and samples containing molecules. For high throughput
formats, such supports are generally addressable and contain addressable loci,
such as positionally addressable target (flat) loci or wells.

EXAMPLE 1

Comparative example setting forth steps in prior processes in which the
data acquisition component (data collection routine(s)) and the biological calling
component (data processing routine) are not integrated as provided herein (see,
e.g., International PCT application Nos. WO 00/60361 and WO 02/25567):

A. First, obtain the data:

    1. Place a support, such as a chip, or target with one or more
    samples on it into the data acquisition instrument, such as a mass
    spectrometer;

    2. Get a locus list, such as a well list, from the user; this is not
    related to the support used or a set of assays, it is just a list of the
    loci, such as wells, to run. The list of loci (i.e., wells) includes the
    calibrant loci (wells); they are not distinguished as different from
    the other loci; they are run and the raw data are saved in the same
    way the data obtained from other loci are saved.

    3. The user adjusts the geometry manually by centering one or
    more loci on a mark on the screen.

    4. Collect the data.

        a. Go to the next locus (the first locus on the first time
        through this loop);

        b. Measure a raw mass spectrum;

        c. Examine the mass spectrum to see if there are any
        peaks in a fixed mass range;

        d. If there are peaks, save the raw spectrum in a file
        and go back to step a;

        e. If there are no peaks, raster to a new spot on the
        sample and go back to step b;

The above loop saves the first "good" raster on each sample, where
"good" means that, using a simple criterion (not biology based), the mass
spectrum had a peak that had good signal to noise somewhere in a fixed mass
range window. After this data collection loop is run on a mass spectrometer,
the system has a directory full of raw mass spectrum files; there are no
biological results calculated.

B. Then, separately calculate the biological results (genotypes, allele
frequencies, etc.)

    1. Copy the raw mass spectrum files from the mass
    spectrometer to a workstation configured for data processing;

    2. Get a list of assays for each sample from the database.
    There can be one or more assays for each sample. If there is more
    than one assay assigned to a sample this is referred to as a
    multiplex;

    3. Now calculate the assay results;

        a. First get the raw files which were measured from
        calibration wells and calibrate the mass range

        b. For each spectrum file do the following:

            i. Get the assay information for this spectrum and
            calculate the results of each assay represented by this
            spectrum.

            ii. Store the assay results in a database

When performing high throughput assays this way, there is no assay information
used while running the data acquisition instrument, such as a mass
spectrometer. There are no biology based results calculated or displayed, nor are
such results used to control the data acquisition instrument, such as a mass
spectrometer, while it is running.

Using the biology based results to control the data acquisition instrument, such
as a mass spectrometer, an improvement in data quality is observed.

Another difference is that when running the methods and using the
systems provided herein (see, e.g., EXAMPLES 2 and 3), the system "knows" if
there are multiple assays to measure on a sample and it treats each assay
independently, so it is possible to end up with a different spectrum for each assay
in a well. This can happen, for example, if the first assay is measured with high
quality (conservative) on the first raster but other assays need more rasters to get
a high quality result. The first assay has spectra generated from the first raster
saved in the database as the raw data for this assay. The other assays for this
sample have different spectra that are a combination of subsequent rasters
that give a high quality result.

When the data acquisition and biological calling are not
integrated, there is only one spectrum per sample because the system does not
know there are multiple assays to measure when it is collecting the data.

EXAMPLE 2

In contrast to the method and system described in EXAMPLE 1, the following
is an exemplary test process for the RT operation, such as that depicted in FIG.
4 (see also FIG. 8 for another embodiment).

1. Place a support, such as a chip or target with one or more samples
on it into a mass spectrometer;

2. Start the run;

3. Get a list of samples from the database; and

4. Get a list of assays for each sample from the database. There can
be one or more assays for each sample. If there is more than one assay
assigned to a sample this is referred to as a multiplex.

5. Instruct the mass spectrometer to move to a set of samples (one
or more) and use the sample visualization system, the framegrabber, and
the image processing system to determine the position offset from the
ideal target grid. This offset from ideal position is used to correct for
geometry tolerance in the target and mechanical stage. Each time the
mass spectrometer is instructed to move to a new sample this offset is
included in the position of the new sample so that the target is positioned
accurately with the sample lined up with the laser.

6. Instruct the mass spectrometer to move to one or more calibration
samples. Use the calibration data to calibrate the mass scale of the
system.

7. Now start to measure the samples.

    A. Move to the next sample in the list (the first time through
    this is the first sample).

    B. Now do the following steps repetitively until satisfactory
    results are achieved for each assay in the list of assays for this sample:

        i. Measure the sample.

        ii. Evaluate each assay in the list of assays for this
        sample and calculate a biology based result. This
        result can be a genotype, an allele frequency, etc.

        iii. If any assays in the list do not give a satisfactory
        result, move to a new spot on the same sample (this
        small motion is called rastering) and measure the
        sample again.

        iv. Take the new data and the previous data and add
        the two spectra together. Now evaluate again each assay
        that did not give satisfactory results previously.
        If the result for these assays is improved by
        the new data, then keep the new data, which is the
        sum of previous measurements. Note: At this point
        we have results of one or more measurements on
        the sample. We can look at this set of
        measurements in a number of ways. In the current
        embodiment these measurements are summed if
        there is an improvement due to summing. The
        measurements could also be looked at individually
        and the best could be picked.

        v. If we have not measured the maximum number of
        raster positions on this sample then go back to step
        iii. The maximum number of raster positions is a
        parameter of the user interface.

        vi. Save the results for all assays, failed or not.

        vii. Display the result on the same user interface that is
        used to control the machine. The result is displayed
        in terms of the biological assay or assays that are
        performed.

    C. Go back to step A to get the next sample and continue until
    all samples are processed.
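Steps B.i through B.v for a single sample can be sketched as follows. This is a simplified illustration, not the software described in the examples: spectra are modeled as equal-length lists of intensities, and `measure`, `score` and `threshold` are hypothetical stand-ins for the instrument acquisition, the biology based evaluation of one assay, and the level that counts as a satisfactory result.

```python
def add_spectra(first, second):
    """Sum two spectra point by point (both must have the same length)."""
    return [x + y for x, y in zip(first, second)]


def measure_sample(assays, measure, score, threshold, max_rasters):
    """Raster one sample until every assay is satisfactory or the limit is hit.

    measure(position)      -> spectrum acquired at that raster position
    score(assay, spectrum) -> numeric quality of the biology based result
    Returns {assay: (kept_spectrum, kept_score)}.
    """
    first = measure(0)                                         # step i
    results = {a: (first, score(a, first)) for a in assays}    # step ii
    for position in range(1, max_rasters):                     # step v
        pending = [a for a in assays if results[a][1] < threshold]
        if not pending:
            break
        new_spectrum = measure(position)                       # step iii
        for assay in pending:                                  # step iv
            kept_spectrum, kept_score = results[assay]
            summed = add_spectra(kept_spectrum, new_spectrum)
            summed_score = score(assay, summed)
            # Keep the summed data only if it improves this assay's result.
            if summed_score > kept_score:
                results[assay] = (summed, summed_score)
    return results
```

Because each assay keeps its own spectrum, an assay that was satisfactory on the first measurement retains that first spectrum, while an assay that needed further rastering ends up with a summed spectrum, which is how different assays in one well can end up with different spectra.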

EXAMPLE 3

A Bruker Biflex instrument was modified to include real-time genotype
calling capabilities as provided herein. The modifications included the addition
of a PC workstation equipped with a Signatec PDA500 500 MHz 8-bit digitizer
and a National Instruments IMAQ-PCI 1411 frame grabber. Four signals were
disconnected from the Biflex and routed to the PC Workstation. These signals
are: the output of the MCP detector, the trigger for the laser and high voltage
electronics, the output of a photo diode detector used to trigger the data
acquisition, and the video signal from the sample visualization camera. The
output of the detector in the Biflex passed through a gain of five pre-amplifier
(Stanford Research Systems SR445) and a passive low pass filter with a cutoff
frequency of 90 MHz (Mini Circuits BLP-90). In addition, there was a TCP/IP
connection between the PC workstation and the controlling computer on the
Biflex (Sun workstation).

The software on the Sun workstation was modified to accept commands
over the TCP/IP interface to move the stage from sample-to-sample and to
different raster positions within a sample. The workstation was equipped with
software that triggers the mass spectrometer (laser and high voltage pulsing)
and acquires the spectrum. The software also controls stage position.

The software that was incorporated into the system can control the mass
spectrometer and acquire spectra and process these spectra in real-time. The
biology based results are used to decide whether or not to raster. The software
uses the following algorithm. A set number of shots determined by a parameter
are averaged to create a spectrum. Each assay defined for that sample is judged
independently. If the score for an assay is less than moderate, then the system
collects more data for that assay. Another set of shots is averaged and the
result of this data collection is added to the first. Again, a result for each assay
is determined and if the score for that assay improves by adding the new shots,
the new sum is kept. The process continues until a set number of attempts has
expired or a score of moderate or better is achieved for each assay in the
sample. It is possible that each assay in the well ends up with a different
spectrum.

In a typical experiment, 48 new non-established (this was the first run for
each of these assays) 4-plexes were performed on eight different DNAs (384
reactions) and spotted on a 384 chip. The same 384 chip was measured
consecutively on three different Biflex instruments. The first run was performed
using the standard Biflex AutoXecute software. The standard acquisition
available with the Biflex uses fuzzy logic to control rastering based on resolution
and signal to noise ratio over a fixed mass range. The next two runs were
performed on Biflex instruments equipped as provided herein. Normally the
quality of the data would decrease in consecutive runs because the sample is
depleted by successive laser shots. The results are presented in Table 1, FIG. 9
and FIG. 10. FIG. 9 is a diagram that shows the spectra acquired and data
outcome from data acquisition without the assay-based rastering control
provided herein. FIG. 10 is a diagram that shows the spectra acquired and data
outcome from data acquisition using the biology-dependent rastering control
provided herein.

FIG. 9 shows a spectrum that contains the results from four assays using
the "standard" configuration instrument system in which data are processed
without benefit of the assay-based judging described herein. The system made
two conservative calls and two aggressive calls and found at least one peak of
sufficient quality to use this spectral output and move to the next sample. In
contrast, using assay-based processing provided herein showed that the data
was not sufficient to provide the desired four high-quality genotyping results.
As a result, additional data acquisition resulted in four conservative calls, as
illustrated in FIG. 10.

Specifically, FIG. 9 shows a spectrum acquired by the Biflex using its
standard judging algorithms. This spectrum contains the results for four assays.
In this case, the data resulted in two conservative calls and two aggressive calls.
As can be seen, the spectrum is complex and the simple judging employed by the
Bruker found at least one peak of sufficient quality to save this spectrum and
move on. FIG. 10 shows the spectra acquired using assay-based judging. In
this case, the data resulted in four conservative calls. FIG. 10 depicts spectra
acquired using real-time genotyping to control the acquisition. 48 new
non-established (this was the first run for each of these assays) 4-plexes were
performed on eight different DNAs (384 reactions) and spotted on a 384 chip.
The same 384 chip was measured consecutively on three different Biflex
instruments. The first run was performed using the standard Biflex AutoXecute
software. The standard acquisition uses fuzzy logic to control rastering based
on resolution and signal to noise ratio over a fixed mass range. The next two
runs were performed on modified Biflex instruments. The modifications were as
described in the experimental section above. Normally the quality of the data
would decrease in consecutive runs because the sample is depleted by
successive laser shots. The results are presented in the Table.

Table 1

Call quality                   Run 1            Run 2          Run 3
                               Standard         Assay based    Assay based
                               configuration
Total possible calls           1536             1536           1536
Conservative calls             1062             1310           1199
Moderate calls                 121              86             167
Aggressive calls               90               5              23
Low probability                140              58             98
Bad spectrum                   123              77             98
Total "good" calls             1183             1396           1366
Improvement in efficiency      N/A              18%            15.5%
over Standard configuration

In the Table, a "good" call is defined as the total of conservative calls plus the moderate
calls.

The results in Table 1 show that the call efficiency was improved by
using assay-based judging in accord with the methods and systems provided
herein to control data acquisition. In particular, overall call efficiency was
improved from 77% for the "standard" configuration to 90.9% in the first data
run using the assay-based judging.
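The percentages quoted above can be reproduced from the "good" call totals in Table 1. The short calculation below is offered as a check; the variable and function names are hypothetical, and the "improvement in efficiency" row is assumed to be the relative increase in good calls over Run 1, an interpretation that matches the tabulated 18% and 15.5%.

```python
TOTAL_POSSIBLE_CALLS = 1536
GOOD_CALLS = {"Run 1": 1183, "Run 2": 1396, "Run 3": 1366}


def call_efficiency(good_calls, total=TOTAL_POSSIBLE_CALLS):
    """Percentage of possible calls that were conservative or moderate."""
    return 100.0 * good_calls / total


def improvement_over(baseline_good, good_calls):
    """Relative increase in good calls over the baseline run, in percent."""
    return 100.0 * (good_calls - baseline_good) / baseline_good


for run, good in GOOD_CALLS.items():
    print(run, round(call_efficiency(good), 1),
          round(improvement_over(GOOD_CALLS["Run 1"], good), 1))
```

Run 1 works out to 1183/1536, about 77.0%, and Run 2 to 1396/1536, about 90.9%, in agreement with the figures stated in the text.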

Since modifications will be apparent to those of skill in this art, it is intended
that this invention be limited only by the scope of the appended claims.

WHAT IS CLAIMED:

1. A system for performing a biological assay and assay-based
judging, comprising:

an instrument for detecting molecules in samples; and
a processor that communicates with the instrument to provide results-based
control of the instrument to effect assay-based judging.

2. The system of claim 1, wherein:

the instrument is configured to acquire biological data from a biological
sample and to display real-time results from assays performed on the sample;
and

the processor comprises:

a computer-directed data collection processing routine; and
a computer-directed data processing routine, wherein the data
collection and data processing routines are integrated to effect the assay-based
judging.

3. The system of claim 1, wherein the processor comprises
programs for data collection and data analysis that are integrated by a
computer-based calling component, whereby the instrument can provide real time (RT)
results.

4. The system of claim 2, wherein integration is effected by a calling
component that identifies a data result to make decisions regarding further data
acquisition responsive to biological results.

5. The system of claim 3, wherein the processor comprises a
programming interface that controls a dialog between the data acquisition
instrument and the calling component.

6. The system of claim 1, wherein the processor directs the
instrument to acquire data indicative of the biological sample, establishes a data
spectrum criteria, generates data parameters using the acquired data, compares
the data parameters to the data spectrum criteria, adjusts the instrument
responsive to the data comparison, and directs the instrument to acquire other
data for the biological assay, wherein predetermined data criteria relate to the
biological assay with respect to the biological sample.

7. A system of claim 1, wherein the processor receives an assay
design and repeatedly acquires data in accordance with the assay design.

8. A system of claim 1, further including a database in
communication with the processor, wherein the database stores assay
information.

9. A system of claim 8, wherein the processor further receives a
portion of the assay information from the database and uses the received portion
of the assay information to adjust the instrument.

10. A system of claim 1, wherein the instrument is configured as a
mass spectrometer.

11. A system of claim 1, wherein the processor is configured as a
computer device coupled to the instrument.

12. A system of claim 1, wherein the processor generates the data
parameters by generating a data parameter indicative of standard deviation for a
characteristic of the acquired data.

13. A system of claim 1, wherein the processor generates the data
parameters by generating a data parameter indicative of statistical probability.

14. A system of claim 1, wherein the processor generates the data
parameters by generating a data parameter indicative of allele probability.

15. A system of claim 1, wherein the assays result is a diagnosis.

16. A system of claim 1, wherein the assays result is a genotype.

17. A system of claim 2, wherein the assays result is a diagnosis.

18. A system of claim 2, wherein the assays result is a genotype.

19. A system for testing a biological sample, comprising:
an instrument configured to acquire biological data from the biological
sample;

a processor communicating to the instrument, the processor performing
steps comprising:

directing the instrument to acquire data indicative of the biological
sample;

evaluating the acquired data to produce a result;

automatically adjusting the instrument responsive to evaluating the data;
and

directing the instrument to acquire other data indicative of the biological
sample.

20. The system of claim 19, wherein the instrument is a mass 
spectrometer. 

21. The system of claim 20, wherein the processor further performs
operations comprising:

establishing a spectral criteria; and

evaluating the acquired data using the spectral criteria.

22. A system for performing a diagnostic assay using a set of
biological samples, comprising:

an instrument configured to acquire biological data from the biological
samples;

a processor communicating to the instrument, the processor performing
the steps comprising:

directing the instrument to acquire data indicative of one of the biological
samples in the set;

evaluating the acquired data;

determining if the acquired data supports a diagnostic conclusion; and

directing the instrument to acquire data indicative of a next one of the
biological samples in the set responsive to the determining step.

23. The system of claim 21, wherein the instrument is a mass
spectrometer.

24. A system for performing a diagnostic assay using a set of
biological samples, the system comprising:

a workstation that communicates with an instrument that is configured to
acquire biological data from successive biological samples in the set, and that
controls the instrument to acquire data indicative of each successive biological
sample, determines if the instrument should be adjusted in response to
evaluating the acquired data from a set, and directs the instrument to acquire
other data indicative of the biological sample responsive to the determination;
and

a database that stores the acquired data from the biological samples.

25. The system of claim 24, wherein the instrument is a mass 
spectrometer. 

26. The system of claim 24, wherein the workstation evaluates the
acquired data, determines if the acquired data supports a diagnostic conclusion,
and directs the instrument to acquire data indicative of a next one of the
biological samples in the set, responsive to the determination.

27. The system of claim 26, wherein the workstation directs the
instrument to acquire other data for a next sample if the previously acquired data
supports the diagnostic condition or if a maximum data acquiring number has
been exceeded, and otherwise directs the instrument to acquire other data for
the same sample.
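
The control flow recited in claims 26 and 27 can be pictured with the following minimal Python sketch, offered as an illustration only. The callables `acquire` and `supports_conclusion`, the default limit of three acquisitions, and the simulated data are assumptions for the example; the sketch also stops when the limit is reached rather than exceeded, which is one possible reading.

```python
def run_sample_set(samples, acquire, supports_conclusion, max_acquisitions=3):
    """Advance to the next sample once the data supports a conclusion or the
    acquisition limit is reached; otherwise re-acquire from the same sample."""
    results = {}
    for sample in samples:
        attempts = 0
        conclusive = False
        while not conclusive and attempts < max_acquisitions:
            data = acquire(sample, attempts)   # hypothetical instrument call
            attempts += 1
            conclusive = supports_conclusion(data)
        results[sample] = (conclusive, attempts)
    return results


# Simulated instrument: each sample yields usable data only from a given attempt on.
first_good_attempt = {"A": 0, "B": 2, "C": 99}


def fake_acquire(sample, attempt):
    return attempt >= first_good_attempt[sample]


outcome = run_sample_set(["A", "B", "C"], fake_acquire, lambda data: data)
```

In this simulation sample A is resolved on the first acquisition, sample B on the third, and sample C is abandoned after the limit is reached.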

28. The system of claim 24, wherein the database is maintained by a 
database server. 

29. The system of claim 24, wherein the workstation includes an
assay controller that acquires assay design specifications from the database
server.

30. The system of claim 24, wherein the workstation includes a data
acquisition controller that automatically aligns a laser of the instrument on one of
the biological samples and controls movement of the sample in the instrument so
as to receive biological data from the instrument.

31. The system of claim 30, wherein the workstation includes a data
analysis controller that receives a data signal from the data acquisition controller
and makes the determination of directing the instrument to acquire other data
indicative of the biological sample, in response to the data signal.

32. The system of claim 24, wherein the workstation is constructed
integrally with the instrument.

33. The system of claim 24, wherein the workstation evaluates the 
acquired data with respect to a biological result. 

34. The system of claim 33, wherein the biological result is a
determination of a genotype of a sample.

35. A method of performing a diagnostic assay using a set of 
biological samples, the method comprising: 

directing an instrument to acquire data indicative of one of the biological
samples in the set;

evaluating the acquired data; 

determining if the acquired data supports a diagnostic conclusion; and 

responsive to the determination, directing the instrument to acquire data 
indicative of a next one of the biological samples in the set. 

36. The method of claim 35, further comprising:

establishing a data spectrum criteria;

generating data parameters using the acquired data;

comparing the data parameters to the spectrum criteria; and

adjusting the instrument responsive to evaluating the data.

37. The method of claim 36, wherein generating the data parameters
includes generating a data parameter indicative of standard deviation for a
characteristic of the acquired data.

38. The method of claim 36, wherein generating the data parameters
includes generating a data parameter indicative of statistical probability.

39. The method of claim 34, wherein generating the data parameters
includes generating a data parameter indicative of allele probability.

40. The method of claim 36, further including receiving an assay
design.

41. The method of claim 36, further including storing the acquired
data from the biological samples of the set in a database server.

42. The method of claim 41, further including:

receiving a portion of the assay information from the database server; and

using the received portion of the assay information to adjust the operation
of the instrument.

43. The method of claim 36, wherein the instrument is a mass
spectrometer.

44. The method of claim 36, wherein: 

performance history of an assay in a multiplex format is evaluated; and 
processing of the assay is adjusted in response to the performance 
history. 

45. The method of claim 44, wherein, if the assay fails for a
predetermined number of samples, it is eliminated from consideration.

46. The method of claim 44, wherein, if the assay fails for a
predetermined number of samples, it is performed in the remaining samples but
is not rastered in the remaining samples.

47. The method of claim 44, wherein assay processing is adjusted
responsive to a failure of the assay to provide a biological result for a
predetermined number of samples.

48. The method of claim 47, wherein, if the assay fails for a
predetermined number of samples, it is eliminated from consideration.

49. The method of claim 48, wherein, if the assay fails for a
predetermined number of samples, it is performed in the remaining samples but
is not rastered in the remaining samples.
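
The performance-history adjustment of claims 44 to 49 could be tracked, for example, as in the following minimal Python sketch. The class name `AssayHistory`, the policy labels, and the failure limit are hypothetical and are used only to illustrate the two alternatives recited (eliminating the assay, or running it without rastering).

```python
from collections import Counter


class AssayHistory:
    """Counts, per assay, the samples for which no biological result was obtained."""

    def __init__(self, failure_limit, policy="eliminate"):
        if policy not in ("eliminate", "no_raster"):
            raise ValueError("unknown policy")
        self.failure_limit = failure_limit
        self.policy = policy
        self.failures = Counter()

    def record(self, assay, gave_result):
        if not gave_result:
            self.failures[assay] += 1

    def over_limit(self, assay):
        return self.failures[assay] >= self.failure_limit

    def should_run(self, assay):
        # Under the "eliminate" policy a repeatedly failing assay is dropped.
        return not (self.policy == "eliminate" and self.over_limit(assay))

    def should_raster(self, assay):
        # A repeatedly failing assay is no longer rastered under either policy.
        return not self.over_limit(assay)


history = AssayHistory(failure_limit=2, policy="no_raster")
history.record("assay_1", gave_result=False)
history.record("assay_1", gave_result=False)
history.record("assay_2", gave_result=True)
```

Here `assay_1` would still be performed on the remaining samples but without rastering, while `assay_2` is unaffected.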

50. A method for performing biological assays employing assay-based
judging, comprising:

a) introducing a solid support containing one or a plurality of
samples into the instrument of a system of claim 1, and commencing to assay a
sample on the support;

b) for each sample:

i) measuring the sample by performing assays on each
sample and calculating a biology based result;

ii) if any assays do not give a result, moving to a new
spot on the same sample, and measuring the sample
again;

iii) comparing the new data and the previous data; if the
result for these assays is improved by the new data then
keeping the new data;

iv) if the maximum number of a predetermined number
of raster positions on the sample have not been
measured, repeating the process from step ii);

v) saving the results for all assays failed or not failed;

vi) displaying the biological result on a user interface;
and

c) repeating step a) for the next sample and continuing until all
samples are processed.
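
Steps b) i) through b) iv) above can be sketched, for one sample, as the following minimal Python loop. This is an illustration under stated assumptions: `measure` is a hypothetical callable returning, for each assay, a pair of (result or None, score), and "improved" is taken here to mean that the new data gives a result with a higher score than the data kept so far.

```python
def improved(new, old):
    """Assumed comparison: new data is better if it yields a result and either
    no result was held before or its score is higher."""
    new_call, new_score = new
    old_call, old_score = old
    if new_call is None:
        return False
    return old_call is None or new_score > old_score


def assay_sample(measure, assays, max_positions):
    """Measure raster positions until every assay gives a result or the
    predetermined number of positions has been used."""
    best = {assay: (None, 0.0) for assay in assays}
    for position in range(max_positions):
        new_data = measure(position)
        for assay in assays:
            candidate = new_data.get(assay, (None, 0.0))
            if improved(candidate, best[assay]):
                best[assay] = candidate
        if all(best[assay][0] is not None for assay in assays):
            break
    return best


# Simulated measurements at successive raster positions on one sample.
visited = []
simulated = [
    {"assay_1": ("AG", 0.9), "assay_2": (None, 0.0)},
    {"assay_1": ("AG", 0.7), "assay_2": ("CC", 0.8)},
    {"assay_1": ("AG", 0.5), "assay_2": ("CC", 0.6)},
]


def fake_measure(position):
    visited.append(position)
    return simulated[position]


final = assay_sample(fake_measure, ["assay_1", "assay_2"], max_positions=3)
```

In this simulation the second raster position supplies the missing result for `assay_2`, the better first measurement of `assay_1` is retained, and the third position is never measured.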

51. The method of claim 50, wherein prior to step a):

i) a support with one or more samples on it is introduced
into the instrument; into a mass spectrometer;

ii) a list of samples and a list of assays are obtained from
the database; and

iii) the instrument is calibrated.

52. The method of claim 50, wherein the instrument is a mass
spectrometer.

53. The method of claim 50, wherein the biological result is a 
determination of risk of developing a disease or condition. 

54. The method of claim 41, wherein the biological result is a
diagnosis.

55. The method of claim 41, wherein the biological result is a
genotype.

56. The method of claim 41, wherein the biological result is an allelic
frequency.

57. The system of claim 1, wherein the result for the results-based
control is a determination of risk of developing a disease or condition.

58. The system of claim 1, wherein the result for the results-based
control is a diagnosis.

59. The system of claim 1, wherein the result for the results-based
control is a genotype.

60. The system of claim 1, wherein the result for the results-based
control is an allelic frequency.

61. The system of claim 19, wherein the result is a determination of
risk of developing a disease or condition.

62. The system of claim 1, wherein the result is a diagnosis.

63. The system of claim 1, wherein the result is a genotype.

64. The system of claim 1, wherein the result is an allelic frequency.

65. A mass spectrometry system that displays the diagnostic outcome
of an assay, wherein:

the diagnostic outcome is a biological result; and

the display occurs in real-time with respect to the measurement.

66. The system of claim 48, wherein the display is a genotype, allelic
frequency, a determination of risk of developing a disease or condition or a
diagnosis.